REMARKS 

Introductory Comments: 

Claims 8, 9, 10, 11, 12, and 13 were examined in the Office Action under reply. Claims
8, 9, 10, 11, 12, and 13 stand rejected under 35 U.S.C. §101. Claims 9, 10, 11, 12, and 13 stand
rejected under 35 U.S.C. §112, first paragraph. Claims 10, 11, 12 and 13 stand rejected under
35 U.S.C. §102(b) or (e) and claims 8, 9, 10, 11, 12, and 13 stand rejected under 35 U.S.C.
§102(b) as anticipated. These rejections are believed to be overcome by the above amendments
and are otherwise traversed for the reasons discussed below.

Overview of the Amendments: 

Claims 1, 4-8 and 14-17 have been canceled as directed to non-elected subject matter. 
Cancellation of claims 1, 4-8 and 14-17 is without prejudice, without intent to abandon any 
originally claimed subject matter, and without intent to acquiesce in any rejections of record.
Applicants reserve the right to bring the canceled claims again in a related application.

Claims 9-13 have been amended in order to recite the subject invention with greater
particularity. Specifically, claims 9 and 11-13 have been amended to read on the elected 
polynucleotide SEQ ID NOs, make minor wording changes, and correct obvious typographical 
errors. Additionally, recitations from claims 10, 11, and 12 have been incorporated into claim 9. 

New claims 18-24 have been added by this amendment. Support for claim 18, directed at 
a recombinant vector, is found in the specification at page 5, lines 7-8; and pages 8-28. Support 
for claim 19, directed at a host cell, is found in the specification at page 5, lines 7-8; and pages 8- 
28. Support for claim 20, directed at producing a recombinant polypeptide, is found in the 
specification at page 5, lines 22-24; and pages 8-28. Claim 21 corresponds to previous claim 9 
with the non-elected sequences eliminated. Claims 22-24 correspond to claims 18-20 described
above but depend from new claim 21.

Formal Matters: 
Priority 

The Examiner stated that priority has not been granted to the claimed international 
application PCT/IB98/01665 because no certified copy of the application was submitted to the 
Office. In lieu of submitting a costly certified copy of the 524 page application, applicants 
append hereto (1) a copy of Form PCT/IB308 "Notice Informing the Applicant of the
Communication of the International Application to the Designated Offices" and (2) a copy of the
cover page of PCT/IB98/01665 indicating an October 9, 1998 filing date. In accordance with 
PCT Rule 47.1(c), third sentence, those Offices will accept the present Notice as conclusive 
evidence that the communication of the international application has duly taken place on the date 
of mailing indicated above and no copy of the international application is required to be 
furnished by the applicant to the designated Office(s) (emphasis added). As indicated on the 
enclosed Form PCT/IB308, notice was given to the U.S. by the International Bureau, thus no
copy of the international application is required.

Sequencing Rules 

The Examiner stated that the application failed to comply with the requirements of 37 
CFR §§ 1.821-1.825 because certain sequences were not listed in the Sequence Listing. 
Applicants are providing a substitute specification that includes the information required by 37 
CFR §§ 1.821-1.825. Specifically, applicants are providing a new Sequence Listing with all 
sequences disclosed in the filed specification. Additionally, the substitute specification includes 
sequence identifiers in the proper format at each sequence. In accordance with 37 CFR 1.821(f), 
the content of the sequence listing information recorded in computer readable form (submitted 
herewith) is identical to the written (paper) Sequence Listing (submitted herewith). The 
Sequence Listing includes no new matter. 

Drawings 

Applicants are submitting corrected drawings in accordance with Form PTO 948 under 
separate cover. 

Objection to Title 

The Examiner objected to the title, "Neisserial Antigens", as not being descriptive of the
invention to which the claims are directed. In the attached substitute specification, the title of the
application has been amended to "Neisserial Polynucleotides" to more clearly indicate the
invention to which the elected claims are directed.

Brief Description of the Drawings

Applicants appreciate the Examiner pointing out the need for a recitation of subparts for 
various figures in the "Brief Description of the Drawings". The substitute specification includes 
a recitation of subparts where appropriate. 

Response to Claim Rejections 

Claim Rejections under 35 U.S.C. §101 

The Examiner rejected claims 8, 9, 10, 11, 12 and 13 under 35 U.S.C. §101, asserting "the
claimed invention is directed to non-statutory subject matter" (Office Action, page 4). In order
to facilitate prosecution, the term "isolated" has been added to claims 9, 10, 11, 12 and 13, as
suggested by the Examiner. Accordingly, the rejections under 35 U.S.C. §101 should be
withdrawn.

Claim Rejections under 35 U.S.C. §112, First Paragraph

The Examiner has rejected claims 9, 10, 11, 12 and 13 under 35 U.S.C. §112, first
paragraph, asserting that the claims contain "subject matter which was not described in the 
specification in such a way as to reasonably convey to one skilled in the relevant art that the 
inventors, at the time the application was filed, had possession of the claimed invention" (Office 
Action, page 5). 

The Examiner argues: "given the broad scope of the claims due to the use of the open 
language 'comprising', they are drawn to a genus: any nucleotide that minimally contains the 
sequences of the claimed SEQ ID NOs, including full length genes, any fusion constructs, etc" 
(Office Action, page 5). 

Further, the Examiner argues that the: 

"mere disclosure of a species: sequences of the elected SEQ ID NOs, does not 
provide adequate written description of the claimed genus. In view of the level of 
knowledge and skill in the art, one skilled in the art would not recognize from the 
disclosure that the applicant was in possession the genus of DNAs or RNAs 
encompassed in the claims which the sequences of the claimed SEQ ID NOs." 
(Office Action, page 6).

However, applicants respectfully disagree.

In order to comply with the written description requirement, an applicant's specification 
must convey with reasonable clarity to those skilled in the art that, as of the filing date sought, he 
or she was in possession of the invention, i.e., whatever is now claimed. Vas-Cath Inc. v.
Mahurkar, 19 USPQ2d 1111, 1117 (Fed. Cir. 1991) (cited in MPEP §2163 and in the Examiner
Guidelines on Written Description Requirement). The Examiner has the initial burden of 
presenting evidence or reasons why persons skilled in the art would not recognize in an 
applicant's disclosure a description of the invention defined by the claims. In re Wertheim, 191 
USPQ 90 (CCPA 1976) (cited in MPEP §2163.04 in the Examiner Guidelines on Written 
Description Requirement). Moreover, it is axiomatic that a patent specification "need not teach, 
and preferably omits, what is well known in the art." See Spectra-Physics, Inc. v. Coherent, Inc.,
3 USPQ2d 1737, 1743 (Fed. Cir. 1987); Hybritech Inc. v. Monoclonal Antibodies, Inc., 231
USPQ 81, 94 (Fed. Cir. 1986). Thus, determining whether the written description is satisfied 
requires reading the disclosure in light of the knowledge possessed by those skilled in the art. In 
re Alton, 37 USPQ2d 1578 (Fed. Cir. 1996). 

The written description requirement does not necessitate the description of every species
falling within the purview of a claimed genus. Further, satisfaction of the written description
requirement does not require applicants to provide experimental data. Rather, the purpose of the
written description requirement of 35 U.S.C. §112, first paragraph, is to ensure that applicants
were in possession of the claimed invention at the time of filing. Vas-Cath Inc. v. Mahurkar, 19
USPQ2d 1111, 1117 (Fed. Cir. 1991) (cited in MPEP §2163). Accordingly, the PTO Revised
Examiner Guidelines on Written Description state:

Prior to determining whether the disclosure satisfies the written 
description requirement for the claimed subject matter, the 
examiner should review the claims and the entire specification, 
including the specific embodiments, figures, and sequence listings, 
to understand what applicant has identified as the essential 
distinguishing characteristics of the invention. ... i.e., what the 
applicant has demonstrated possession of, and what applicant has 
claimed. 

* * * 

The written description requirement for a claimed genus may be 
satisfied through sufficient description of a representative number 
of species by actual reduction to practice ... or by disclosure of 
relevant identifying characteristics, i.e., structure or other physical
and/or chemical properties, by functional characteristics coupled
with a known or disclosed correlation between function and 
structure, or by a combination of such identifying characteristics, 
sufficient to show the applicant was in possession of the claimed 
genus... A 'representative number of species' means that the species
which are adequately described are representative of the entire 
genus... Satisfactory disclosure of a 'representative number' 
depends on whether one of skill in the art would recognize that the 
applicant was in possession of the necessary common attributes or 
features of the elements possessed by the members of the genus in 
view of the species disclosed (64 Fed. Reg. 71427, emphasis 
added) 

The burden is on the Examiner to provide evidence as to why a skilled artisan would not
have recognized that the applicants were in possession of the claimed invention at the time of filing.

Applying these tenets, applicants submit that the Office has failed to carry its burden and
that the present claims indeed comply with the written description requirement of 35 U.S.C.
§112, first paragraph. A review of the application as a whole, coupled with the knowledge in the
art at the time of filing, evidences that the application is more than sufficient to convey with
reasonable clarity to those skilled in the art that, as of the filing date sought, applicants were in
possession of the invention.

First, the written description requirement for a claimed genus may be satisfied through 
sufficient description of a representative number of species by actual reduction to practice. 
Contrary to the Office's position, applicants have indeed pointed to a number of specific DNA 
constructs and nucleic acid sequences (i.e., species) falling within the scope of the generic 
claims. Applicants call out specific DNA constructs, for example, polynucleotide open reading 
frames (ORFs) that were cloned into expression vectors such as pGEX, pTCR, pET, pGEX-His 
(Specification, page 69, lines 23-26, emphasis added). Indeed, "Table II - Summary of cloning,
expression and purification" indicates that over 40 ORFs that contain the sequences of the
claimed SEQ ID NOs were cloned into DNA constructs (Specification, Table II at pages 74-76).
Accordingly, applicants have explicitly disclosed a large number of species falling within the 
generic claims. 

Secondly, applicants have explicitly identified the essential distinguishing characteristics 
of the invention. At page 4, lines 14-15, applicants state the "invention provides nucleic acids 
comprising the Neisserial nucleotide sequences disclosed in the examples". The application
discloses 106 Examples and hundreds of polynucleotide sequences. Further, at page 5, lines 8-9,
"the invention provides vectors comprising the nucleotide sequences of the invention (e.g.
expression vectors)...".

Based on the foregoing, there can be no doubt that applicants have identified a large 
number of species falling within the generic claims and identified the essential distinguishing 
characteristics of the invention. Therefore applicants have demonstrated possession of the 
claimed invention as set forth in the above Guidelines. Nevertheless, in order to hasten 
prosecution, applicants' claims now include the recitation of an open reading frame. There is 
extensive support throughout the specification for ORFs containing the polynucleotides of the 
present invention including Table II at pg 74-76 and the 106 Examples. Hence, the Office's 
rejection of the claims under 35 U.S.C. §112, first paragraph has been overcome and withdrawal 
thereof is respectfully requested. 

The Examiner further rejected claims 12 and 13 under 35 U.S.C. §112, first paragraph,
asserting that "the specification is not deemed to provide reasonable support to one of ordinary 
skill in the art that the biochemical activity of a nucleic acid at least 50% but less than 100% 
identical to the entire length of the elected sequences would be the same" (Office Action, page 
6). Applicants respectfully disagree. Nevertheless, in order to advance prosecution, applicants 
have amended claim 12 to raise the percent identity to "90%." Claim 9, as amended, also recites
the percent identity of 90%, and claim 13 now depends from claim 9. Support for amended
claims 9 and 12 is found in the specification at page 8, line 4. 

In view of the above arguments and amendments, the applicants submit that the pending 
claims reasonably convey the claimed invention to one of ordinary skill in the art. Accordingly, 
the rejection of the claims under 35 U.S.C. §112, first paragraph, has been
overcome and withdrawal thereof is respectfully requested. 

Claim Rejections under 35 U.S.C. §102 

Claims 10, 11, 12 and 13 were rejected under 35 U.S.C. §102(b) or (e) as anticipated by
the various GenEmbl sequences or U.S. patents listed in the table on page 10 of the Office
Action under reply. Specifically, the Examiner has rejected claim 10, and dependent claims 11,
12 and 13, asserting that the claims are anticipated by the disclosure of various database sequences
and U.S. patents that comprise a fragment of at least 10 base pairs of sequences of the elected
SEQ ID NOs, as required by claim 10 and as defined in the specification for the term "fragment"
(page 4).

In order to facilitate prosecution, claim 10 has been amended to recite "a fragment greater 
than 18 nucleotides in length . . .". Support for amended claim 10 is found in the specification at 
page 4, lines 3-4. None of the database sequences or U.S. patents cited disclose a sequence
greater than 18 nucleotides that is identical to SEQ ID NO: 125, SEQ ID NO: 127, SEQ ID NO:
131, SEQ ID NO: 463, SEQ ID NO: 465, SEQ ID NO: 569, or SEQ ID NO: 571. To further
advance prosecution, SEQ ID NO: 651, SEQ ID NO: 649, and SEQ ID NO: 653 have been deleted
from claim 10.

In view of the above amendments and arguments, the cited reference sequences cannot be 
said to teach all the elements of the present invention. Accordingly, withdrawal of the above 
rejections is respectfully requested. 

The Examiner also rejected claims 8-13 under 35 U.S.C. §102(b) as being anticipated by
Paruchuri et al. (PNAS, USA, Vol. 87, No. 1, pages 333-337, 1990). The Examiner states that
Paruchuri et al. describes the isolation of chromosomal DNA from wild-type Neisseria
gonorrhoeae. The Examiner further alleges: "since the nucleic acid sequences of the elected SEQ
ID NOs are from Neisseria, it is inherent that the nucleic acid molecules i.e. the Neisserial
chromosomal DNA, disclosed by Paruchuri et al. encode the proteins encoded by the nucleic
acid sequences of the elected SEQ ID NOs, as required in claim 8" (Office Action, page 11).
Additionally, the Examiner argues the DNA molecules disclosed in Paruchuri et al. comprise the
nucleotide sequences of the elected SEQ ID NOs, and fragments thereof, as recited in claims 9
and 10. The Examiner further contends that DNA molecules disclosed in Paruchuri et al.
inherently comprise the sequences complementary to the sequences of the elected SEQ ID NOs,
or fragments thereof, as specified in claim 11. The Examiner further alleges the DNA molecules
disclosed in Paruchuri et al. inherently comprise sequences that are at least 50% identical to the
sequences of the elected SEQ ID NOs, or fragments thereof, as recited in claim 12. Lastly, the
Examiner argues that DNA molecules disclosed in Paruchuri et al. can inherently hybridize to
the sequences of the elected SEQ ID NOs, or fragments thereof, or complements thereof, as
specified in claim 13. Applicants respectfully disagree that Paruchuri et al. anticipates claims
9-13. The Examiner's rejection of claim 8 is moot.

A claim is anticipated only if each and every element as set forth in the claim is found,
either expressly or inherently described, in a single prior art reference. Verdegaal Bros. v. Union
Oil Co. of California, 814 F.2d 628, 631 (Fed. Cir. 1987); see also MPEP §2131. As the Examiner
correctly stated, Paruchuri et al. discloses the isolation of the chromosomal DNA from wild-type
Neisseria gonorrhoeae (Office Action, page 9). Paruchuri et al. did not determine the nucleotide
sequence of any part of the Neisseria gonorrhoeae chromosome. There is nothing to suggest
Paruchuri et al. contains any teachings of the identical chemical structure of the isolated
polynucleotides in claims 9-13.

Paruchuri et al. does not isolate polynucleotides as claimed, and applicants respectfully
submit that the Examiner's reliance on Paruchuri et al. is misplaced. Inherency may not be
established by probabilities or possibilities. The mere fact that a certain characteristic may be
present in the prior art is not sufficient to establish inherency of that characteristic. Scaltech Inc.
v. Retec/Tetra LLC, 156 F.3d 1193 (Fed. Cir. 1999). To establish inherency, the extrinsic
evidence "must make clear that the missing descriptive matter is necessarily present in the thing 
described in the reference, and that it would be so recognized by persons of ordinary skill in the 
art." In re Robertson, 49 USPQ2d 1949, 1950-51 (Fed. Cir. 1999), quoting Continental Can v. 
Monsanto Co., 948 F.2d 1264, 1268, 20 USPQ2d 1746, 1749 (Fed. Cir. 1991). 

Thus, the fact that polynucleotide sequences may be present in the chromosomal DNA
from Neisseria is not sufficient to establish inherency of the claimed isolated polynucleotide
sequences. The Examiner fails to establish inherency because Paruchuri et al. "must make clear
that the missing descriptive matter [the isolated polynucleotide sequences as defined by the SEQ
ID NOs] is necessarily present in the thing described in the reference [chromosomal DNA], and
that it would be so recognized by persons of ordinary skill in the art." The Examiner has
provided no evidence that those of ordinary skill in the art would have recognized the chemical
structure of the particular isolated polynucleotide sequences claimed in claims 9-13 from the
mere disclosure of chromosomal DNA. Further, applicants respectfully submit that the Examiner's
reliance on In re Best and In re Fitzgerald is misplaced, because the Examiner "must provide
a basis in fact and/or technical reasoning to reasonably support the determination that the allegedly
inherent characteristic necessarily flows from the teachings of the applied prior art". Ex parte Levy,
17 USPQ2d 1461 (Bd. Pat. App. & Inter. 1990) (emphasis in original). Here the Examiner has
failed to provide objective evidence or cogent technical reasoning to support the conclusion of
inherency.

Thus, the Examiner's reliance on Paruchuri et al. is insufficient as a matter of law to meet
the requirements of inherency. The Paruchuri et al. reference does not inherently disclose the
isolated polynucleotides in claims 9-13. Accordingly, withdrawal of the rejections under 35
U.S.C. §102(b) is respectfully requested.

Claim Objections 

The Examiner objected to claims 8, 9 and dependent claims 11-13, asserting that these 
claims do not reflect the elected subject matter. Applicants elected 10 polynucleotide sequences 
or fragments thereof. Claim 8 depends from any one of the claims 4-6, which are drawn to 
polypeptides. Applicants have canceled claim 8 and amended claims 9 and 11-13 to read on the
elected polynucleotide SEQ ID NOs as requested by the Examiner. Accordingly, this basis for
objection has been overcome. 

The Examiner objected to claim 9 because it ends with two periods. Claim 9 has been 
amended to correct the typographical error. Thus, this basis for objection has also been 
overcome. 



CONCLUSION 

Applicants respectfully submit that the claims are novel and nonobvious and comply with
the requirements of 35 U.S.C. §112. Accordingly, allowance is believed to be in order, and an early
notification to that effect is respectfully requested.

Please direct all further written communications regarding this application to:

Alisa Harbin, Esq. 
Intellectual Property - R440
P.O. Box 8097
Emeryville, CA 94662-8097






Dated: 



Chiron Corporation
ATTN: Intellectual Property - R440
P.O. Box 8097
Emeryville, CA 94662-8097
Tel: (650) 843-5000 Fax: (650) 857-0663



By: 



Respectfully submitted, 

COOLEY GODWARD LLP 



Roberta L. Robins 
Reg. No. 33,208

NEISSERIAL [ANTIGENS] POLYNUCLEOTIDES

This application is a continuation-in-part of international patent application PCT/IB98/01665, filed 
October 9, 1998, from which priority is claimed under 35 U.S.C. §119.

This invention relates to antigens from Neisseria bacteria. 

BACKGROUND ART

Neisseria meningitidis and Neisseria gonorrhoeae are non-motile, Gram-negative diplococci that
are pathogenic in humans. N. meningitidis colonises the pharynx and causes meningitis (and,
occasionally, septicaemia in the absence of meningitis); N. gonorrhoeae colonises the genital tract
and causes gonorrhea. Despite colonising different areas of the body and causing completely
different diseases, the two pathogens are closely related. One feature that clearly
differentiates meningococcus from gonococcus is the presence of a polysaccharide capsule that is
present in all pathogenic meningococci.

N. gonorrhoeae caused approximately 800,000 cases per year during the period 1983-1990 in the
United States alone (chapter by Meitzner & Cohen, "Vaccines Against Gonococcal Infection", In:
New Generation Vaccines, 2nd edition, ed. Levine, Woodrow, Kaper, & Cobon, Marcel Dekker,
New York, 1997, pp. 817-842). The disease causes significant morbidity but limited mortality.
Vaccination against N. gonorrhoeae would be highly desirable, but repeated attempts have failed.
The main candidate antigens for this vaccine are surface-exposed proteins such as pili, porins,
opacity-associated proteins (Opas) and other surface-exposed proteins such as the Lip, Laz, IgA1
protease and transferrin-binding proteins. The lipooligosaccharide (LOS) has also been suggested
as a vaccine (Meitzner & Cohen, supra).

N. meningitidis causes both endemic and epidemic disease. In the United States the attack rate is
0.6-1 per 100,000 persons per year, and it can be much greater during outbreaks (see Lieberman et
al (1996) Safety and Immunogenicity of a Serogroups A/C Neisseria meningitidis
Oligosaccharide-Protein Conjugate Vaccine in Young Children. JAMA 275(19):1499-1503;
Schuchat et al (1997) Bacterial Meningitis in the United States in 1995. N Engl J Med
337(14):970-976). In developing countries, endemic disease rates are much higher and during
epidemics incidence rates can reach 500 cases per 100,000 persons per year. Mortality is extremely
high, at 10-20% in the United States, and much higher in developing countries. Following the
introduction of the conjugate vaccine against Haemophilus influenzae, N. meningitidis is the major
cause of bacterial meningitis at all ages in the United States (Schuchat et al (1997) supra).

Based on the organism's capsular polysaccharide, 12 serogroups of N. meningitidis have been
identified. Group A is the pathogen most often implicated in epidemic disease in sub-Saharan
Africa. Serogroups B and C are responsible for the vast majority of cases in the United States and
in most developed countries. Serogroups W135 and Y are responsible for the rest of the cases in
the United States and developed countries. The meningococcal vaccine currently in use is a
tetravalent polysaccharide vaccine composed of serogroups A, C, Y and W135. Although
efficacious in adolescents and adults, it induces a poor immune response and short duration of
protection, and cannot be used in infants [eg. Morbidity and Mortality Weekly Report, Vol. 46, No.
RR-5 (1997)]. This is because polysaccharides are T-cell independent antigens that induce a weak
immune response that cannot be boosted by repeated immunization. Following the success of the
vaccination against H. influenzae, conjugate vaccines against serogroups A and C have been
developed and are at the final stage of clinical testing (Zollinger WD "New and Improved Vaccines
Against Meningococcal Disease" in: New Generation Vaccines, supra, pp. 469-488; Lieberman et
al (1996) supra; Costantino et al (1992) Development and phase I clinical testing of a conjugate
vaccine against meningococcus A and C. Vaccine 10:691-698).

Meningococcus B remains a problem, however. This serotype currently is responsible for
approximately 50% of total meningitis in the United States, Europe, and South America. The
polysaccharide approach cannot be used because the menB capsular polysaccharide is a polymer of
α(2-8)-linked N-acetyl neuraminic acid that is also present in mammalian tissue. This results in
tolerance to the antigen; indeed, if an immune response were elicited, it would be anti-self, and
therefore undesirable. In order to avoid induction of autoimmunity and to induce a protective
immune response, the capsular polysaccharide has, for instance, been chemically modified
substituting the N-acetyl groups with N-propionyl groups, leaving the specific antigenicity
unaltered (Romero & Outschoorn (1994) Current status of Meningococcal group B vaccine
candidates: capsular or non-capsular? Clin Microbiol Rev 7(4):559-575).

Alternative approaches to menB vaccines have used complex mixtures of outer membrane proteins
(OMPs), containing either the OMPs alone, or OMPs enriched in porins, or deleted of the class 4
OMPs that are believed to induce antibodies that block bactericidal activity. This approach
produces vaccines that are not well characterized. They are able to protect against the homologous
strain, but are not effective at large where there are many antigenic variants of the outer membrane
proteins. To overcome the antigenic variability, multivalent vaccines containing up to nine
different porins have been constructed (eg. Poolman JT (1992) Development of a meningococcal
vaccine. Infect. Agents Dis. 4:13-28). Additional proteins to be used in outer membrane vaccines
have been the opa and opc proteins, but none of these approaches have been able to overcome the
antigenic variability (eg. Ala'Aldeen & Borriello (1996) The meningococcal transferrin-binding
proteins 1 and 2 are both surface exposed and generate bactericidal antibodies capable of killing
homologous and heterologous strains. Vaccine 14(1):49-53).

A certain amount of sequence data is available for meningococcal and gonococcal genes and
proteins (eg. EP-A-0467714, WO96/29412), but this is by no means complete. The provision of
further sequences could provide an opportunity to identify secreted or surface-exposed proteins that
are presumed targets for the immune system and which are not antigenically variable. For instance,
some of the identified proteins could be components of efficacious vaccines against
meningococcus B, some could be components of vaccines against all meningococcal serotypes, and
others could be components of vaccines against all pathogenic Neisserias.

THE INVENTION

The invention provides proteins comprising the Neisserial amino acid sequences disclosed in the 
examples. These sequences relate to N. meningitidis or N. gonorrhoeae. 

It also provides proteins comprising sequences homologous (ie. having sequence identity) to the
Neisserial amino acid sequences disclosed in the examples. Depending on the particular sequence,
the degree of identity is preferably greater than 50% (eg. 65%, 80%, 90%, or more). These
homologous proteins include mutants and allelic variants of the sequences disclosed in the
examples. Typically, 50% identity or more between two proteins is considered to be an indication
of functional equivalence. Identity between the proteins is preferably determined by the
Smith-Waterman homology search algorithm as implemented in the MPSRCH program (Oxford
Molecular), using an affine gap search with parameters gap open penalty=12 and gap extension
penalty=1.
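
By way of illustration only, the following Python sketch shows how a Smith-Waterman local alignment with affine gap penalties (gap open penalty=12, gap extension penalty=1) can yield a percent identity figure. It is not the MPSRCH implementation. The function names, the simple match/mismatch scores (a protein search would normally use a substitution matrix), the convention that a gap of length k costs 12 + (k-1), and the choice to report identity over the aligned columns are all assumptions made for this example.

```python
def smith_waterman_affine(a, b, match=5, mismatch=-4, gap_open=12, gap_extend=1):
    """Best local alignment of a and b; returns (score, aligned_a, aligned_b).

    Hypothetical helper. Three states are tracked per cell:
    M = a[i-1] aligned to b[j-1], X = a[i-1] aligned to a gap,
    Y = b[j-1] aligned to a gap.
    """
    NEG = -10**9  # stands in for minus infinity
    n, m = len(a), len(b)
    M = [[0] * (m + 1) for _ in range(n + 1)]
    X = [[NEG] * (m + 1) for _ in range(n + 1)]
    Y = [[NEG] * (m + 1) for _ in range(n + 1)]
    best, bi, bj = 0, 0, 0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            s = match if a[i - 1] == b[j - 1] else mismatch
            # A local alignment may restart anywhere, hence the floor at 0.
            M[i][j] = max(0, max(M[i-1][j-1], X[i-1][j-1], Y[i-1][j-1]) + s)
            # Opening a gap costs gap_open; each further position costs gap_extend.
            X[i][j] = max(M[i-1][j] - gap_open, X[i-1][j] - gap_extend)
            Y[i][j] = max(M[i][j-1] - gap_open, Y[i][j-1] - gap_extend)
            if M[i][j] > best:
                best, bi, bj = M[i][j], i, j

    # Trace back from the best-scoring cell until the score drops to 0.
    out_a, out_b = [], []
    i, j, state = bi, bj, "M"
    while i > 0 and j > 0:
        if state == "M":
            if M[i][j] == 0:
                break
            s = match if a[i - 1] == b[j - 1] else mismatch
            out_a.append(a[i - 1])
            out_b.append(b[j - 1])
            prev = M[i][j] - s
            i, j = i - 1, j - 1
            if prev == M[i][j]:
                state = "M"
            elif prev == X[i][j]:
                state = "X"
            else:
                state = "Y"
        elif state == "X":
            out_a.append(a[i - 1])
            out_b.append("-")
            state = "M" if X[i][j] == M[i-1][j] - gap_open else "X"
            i -= 1
        else:
            out_a.append("-")
            out_b.append(b[j - 1])
            state = "M" if Y[i][j] == M[i][j-1] - gap_open else "Y"
            j -= 1
    return best, "".join(reversed(out_a)), "".join(reversed(out_b))


def percent_identity(aligned_a, aligned_b):
    """Identical columns as a percentage of all aligned columns (an assumed convention)."""
    if not aligned_a:
        return 0.0
    same = sum(1 for x, y in zip(aligned_a, aligned_b) if x == y and x != "-")
    return 100.0 * same / len(aligned_a)
```

For instance, calling `smith_waterman_affine(seq1, seq2)` and passing the two aligned strings to `percent_identity` gives a value that could be compared against a threshold such as 50% or 90%. Other conventions, such as dividing by the full length of the reference sequence, would give different numbers.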

The invention further provides proteins comprising fragments of the Neisserial amino acid
sequences disclosed in the examples. The fragments should comprise at least n consecutive amino
acids from the sequences and, depending on the particular sequence, n is 7 or more (eg. 8, 10, 12,
14, 16, 18, 20 or more). Preferably the fragments comprise an epitope from the sequence.

The proteins of the invention can, of course, be prepared by various means (eg. recombinant
expression, purification from cell culture, chemical synthesis etc.) and in various forms (eg. native,
fusions etc.). They are preferably prepared in substantially pure or isolated form (ie. substantially
free from other Neisserial or host cell proteins).

According to a further aspect, the invention provides antibodies which bind to these proteins. These 
may be polyclonal or monoclonal and may be produced by any suitable means. 

According to a further aspect, the invention provides nucleic acid comprising the Neisserial
nucleotide sequences disclosed in the examples. In addition, the invention provides nucleic acid
comprising sequences homologous (ie. having sequence identity) to the Neisserial nucleotide
sequences disclosed in the examples.

Furthermore, the invention provides nucleic acid which can hybridise to the Neisserial nucleic acid
disclosed in the examples, preferably under "high stringency" conditions (eg. 65°C in a 0.1xSSC,
0.5% SDS solution).

Nucleic acid comprising fragments of these sequences is also provided. The fragments should comprise at
least n consecutive nucleotides from the Neisserial sequences and, depending on the particular
sequence, n is 10 or more (eg. 12, 14, 15, 18, 20, 25, 30, 35, 40 or more).
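
By way of illustration only, the test "at least n consecutive nucleotides from a sequence" can be sketched in Python as a search for a shared run of length n. The function name `shares_fragment` and the sequences used in any call are hypothetical, and the sketch looks only at the strand as written (it does not consider the complementary strand).

```python
def shares_fragment(candidate, reference, n=10):
    """Return True if candidate contains at least n consecutive
    nucleotides that also occur consecutively in reference."""
    if n <= 0:
        raise ValueError("n must be a positive integer")
    candidate, reference = candidate.upper(), reference.upper()
    if len(candidate) < n or len(reference) < n:
        return False
    # Every window of length n in the reference, stored for fast lookup.
    ref_windows = {reference[i:i + n] for i in range(len(reference) - n + 1)}
    # Any shared run of n or more nucleotides contains a shared window of length n.
    return any(candidate[i:i + n] in ref_windows
               for i in range(len(candidate) - n + 1))
```

Raising n (eg. from 10 to 18 or more) makes the test stricter, since fewer unrelated sequences will share a run of that length by chance.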

According to a further aspect, the invention provides nucleic acid encoding the proteins and protein
fragments of the invention.

It should also be appreciated that the invention provides nucleic acid comprising sequences
complementary to those described above (eg. for antisense or probing purposes).
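
By way of illustration only, the complement of a DNA sequence, read 5' to 3', can be derived in Python as follows. The function name `reverse_complement` is hypothetical, and the sketch assumes an unambiguous DNA alphabet (A, C, G, T); any other character is passed through unchanged.

```python
# Watson-Crick pairing for DNA: A pairs with T, C pairs with G.
_COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(seq):
    """Return the complementary strand of seq, written 5' to 3'."""
    # Complement each base, then reverse so the result also reads 5' to 3'.
    return seq.translate(_COMPLEMENT)[::-1]
```

Applying the function twice returns the original sequence, which is a quick sanity check when preparing probe or antisense sequences.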

Nucleic acid according to the invention can, of course, be prepared in many ways (eg. by chemical 
synthesis, from genomic or cDNA libraries, from the organism itself etc.) and can take various 
forms (eg. single stranded, double stranded, vectors, probes etc.).

In addition, the term "nucleic acid" includes DNA and RNA, and also their analogues, such as 
those containing modified backbones, and also peptide nucleic acids (PNA) etc. 

According to a further aspect, the invention provides vectors comprising nucleotide sequences of 
the invention (eg. expression vectors) and host cells transformed with such vectors. 

According to a further aspect, the invention provides compositions comprising protein, antibody,
and/or nucleic acid according to the invention. These compositions may be suitable as vaccines, for 
instance, or as diagnostic reagents, or as immunogenic compositions. 

The invention also provides nucleic acid, protein, or antibody according to the invention for use as 
medicaments (eg. as vaccines) or as diagnostic reagents. It also provides the use of nucleic acid,
protein, or antibody according to the invention in the manufacture of: (i) a medicament for treating
or preventing infection due to Neisserial bacteria; (ii) a diagnostic reagent for detecting the 
presence of Neisserial bacteria or of antibodies raised against Neisserial bacteria; and/or (iii) a 
reagent which can raise antibodies against Neisserial bacteria. Said Neisserial bacteria may be any 
species or strain (such as N. gonorrhoeae, or any strain of N. meningitidis, such as strain A, strain B
or strain C).

The invention also provides a method of treating a patient, comprising administering to the patient 
a therapeutically effective amount of nucleic acid, protein, and/or antibody according to the 
invention. 

According to further aspects, the invention provides various processes. 

A process for producing proteins of the invention is provided, comprising the step of culturing a
host cell according to the invention under conditions which induce protein expression.

A process for producing protein or nucleic acid of the invention is provided, wherein the
protein or nucleic acid is synthesised in part or in whole using chemical means. 

A process for detecting polynucleotides of the invention is provided, comprising the steps of: (a) 
contacting a nucleic probe according to the invention with a biological sample under hybridizing 
conditions to form duplexes; and (b) detecting said duplexes.

A process for detecting proteins of the invention is provided, comprising the steps of: (a) 
contacting an antibody according to the invention with a biological sample under conditions 
suitable for the formation of antibody-antigen complexes; and (b) detecting said complexes.

A summary of standard techniques and procedures which may be employed in order to perform the 
invention (eg. to utilise the disclosed sequences for vaccination or diagnostic purposes) follows.
This summary is not a limitation on the invention but, rather, gives examples that may be used, but 
are not required. 

General 

The practice of the present invention will employ, unless otherwise indicated, conventional 
techniques of molecular biology, microbiology, recombinant DNA, and immunology, which are
within the skill of the art. Such techniques are explained fully in the literature eg. Sambrook
Molecular Cloning: A Laboratory Manual, Second Edition (1989); DNA Cloning, Volumes I and II
(D.N. Glover ed. 1985); Oligonucleotide Synthesis (M.J. Gait ed. 1984); Nucleic Acid Hybridization
(B.D. Hames & S.J. Higgins eds. 1984); Transcription and Translation (B.D. Hames & S.J.
Higgins eds. 1984); Animal Cell Culture (R.I. Freshney ed. 1986); Immobilized Cells and Enzymes
(IRL Press, 1986); B. Perbal, A Practical Guide to Molecular Cloning (1984); the Methods in
Enzymology series (Academic Press, Inc.), especially volumes 154 & 155; Gene Transfer Vectors
for Mammalian Cells (J.H. Miller and M.P. Calos eds. 1987, Cold Spring Harbor Laboratory);
Mayer and Walker, eds. (1987), Immunochemical Methods in Cell and Molecular Biology
(Academic Press, London); Scopes, (1987) Protein Purification: Principles and Practice, Second
Edition (Springer-Verlag, N.Y.), and Handbook of Experimental Immunology, Volumes I-IV (D.M.
Weir and C.C. Blackwell eds 1986).

Standard abbreviations for nucleotides and amino acids are used in this specification.

All publications, patents, and patent applications cited herein are incorporated in full by reference. 
In particular, the contents of UK patent applications 9723516.2, 9724190.5, 9724386.9, 9725158.1, 
9726147.3, 9800759.4, and 9819016.8 are incorporated herein. 

Definitions

A composition containing X is "substantially free of" Y when at least 85% by weight of the total
X+Y in the composition is X. Preferably, X comprises at least about 90% by weight of the total of 
X+Y in the composition, more preferably at least about 95% or even 99% by weight. 
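
The weight criterion in this definition can be checked with simple arithmetic. The following Python sketch is offered only as an illustration; the function names and the example weights are hypothetical.

```python
def fraction_x(weight_x, weight_y):
    """Weight fraction of X in the total of X+Y (same units for both)."""
    return weight_x / (weight_x + weight_y)


def substantially_free(weight_x, weight_y, threshold=0.85):
    """True if X makes up at least `threshold` of X+Y by weight.

    0.85 is the minimum stated in the definition; 0.90, 0.95 or 0.99
    can be passed to test the preferred levels.
    """
    return fraction_x(weight_x, weight_y) >= threshold


# Illustrative weights in mg (made-up values).
print(substantially_free(9.0, 1.0))                  # 90% X
print(substantially_free(8.0, 2.0))                  # 80% X
print(substantially_free(9.0, 1.0, threshold=0.95))  # 90% X against the 95% level
```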

The term "comprising" means "including" as well as "consisting", eg. a composition "comprising"
X may consist exclusively of X or may include something additional to X, such as X+Y.

A "conserved" Neisseria amino acid fragment or protein is one that is present in a particular
Neisserial protein in at least x% of Neisseria. The value of x may be 50% or more, e.g., 66%,
75%, 80%, 90%, 95% or even 100% (i.e. the amino acid is found in the protein in question in all
Neisseria). In order to determine whether an amino acid is "conserved" in a particular Neisserial
protein, it is necessary to compare that amino acid residue in the sequences of the protein in
question from a plurality of different Neisseria (a reference population). The reference population
may include a number of different Neisseria species or may include a single species. The reference
population may include a number of different serogroups of a particular species or a single
serogroup. A preferred reference population consists of the 5 most common Neisseria.

The term "heterologous" refers to two biological components that are not found together in nature. The
components may be host cells, genes, or regulatory regions, such as promoters. Although the
heterologous components are not found together in nature, they can function together, as when a
promoter heterologous to a gene is operably linked to the gene. Another example is where a
Neisserial sequence is heterologous to a mouse host cell. A further example would be two
epitopes from the same or different proteins which have been assembled in a single protein in an
arrangement not found in nature.

An "origin of replication" is a polynucleotide sequence that initiates and regulates replication of
polynucleotides, such as an expression vector. The origin of replication behaves as an autonomous 
unit of polynucleotide replication within a cell, capable of replication under its own control. An 
origin of replication may be needed for a vector to replicate in a particular host cell. With certain 
origins of replication, an expression vector can be reproduced at a high copy number in the
presence of the appropriate proteins within the cell. Examples of origins are the autonomously 
replicating sequences, which are effective in yeast; and the viral T-antigen, effective in COS-7 
cells. 

A "mutant" sequence is defined as DNA, RNA or amino acid sequence differing from but having 
sequence identity with the native or disclosed sequence. Depending on the particular sequence, the 
degree of sequence identity between the native or disclosed sequence and the mutant sequence is 
preferably greater than 50% (eg. 60%, 70%, 80%, 90%, 95%, 99% or more, calculated using the 
Smith-Waterman algorithm as described above). As used herein, an "allelic variant" of a nucleic 
acid molecule, or region, for which nucleic acid sequence is provided herein is a nucleic acid 
molecule, or region, that occurs essentially at the same locus in the genome of another or second 
isolate, and that, due to natural variation caused by, for example, mutation or recombination, has a 
similar but not identical nucleic acid sequence. A coding region allelic variant typically encodes a 
protein having similar activity to that of the protein encoded by the gene to which it is being 
compared. An allelic variant can also comprise an alteration in the 5' or 3' untranslated regions of 
the gene, such as in regulatory control regions (eg. see US patent 5,753,235). 
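
The sequence-identity threshold in the definition of a "mutant" sequence can be illustrated with a minimal Smith-Waterman local alignment. The Python sketch below is an assumption-laden illustration only: the function name is hypothetical, and the scoring values (match +1, mismatch -1, linear gap penalty -2) are chosen for simplicity and are not the parameters referred to elsewhere in this specification.

```python
def smith_waterman_identity(a, b, match=1, mismatch=-1, gap=-2):
    """Percent identity over the best-scoring local alignment of a and b.

    Scoring values are illustrative assumptions (linear gap penalty).
    """
    rows, cols = len(a) + 1, len(b) + 1
    score = [[0] * cols for _ in range(rows)]
    best, best_pos = 0, (0, 0)

    # Fill the dynamic-programming matrix, remembering the highest cell.
    for i in range(1, rows):
        for j in range(1, cols):
            diag = score[i - 1][j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            score[i][j] = max(0, diag, score[i - 1][j] + gap, score[i][j - 1] + gap)
            if score[i][j] > best:
                best, best_pos = score[i][j], (i, j)

    # Trace back from the highest cell until a zero is reached,
    # counting aligned columns and identical positions.
    i, j = best_pos
    identical = aligned = 0
    while i > 0 and j > 0 and score[i][j] > 0:
        diag = score[i - 1][j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
        if score[i][j] == diag:
            identical += a[i - 1] == b[j - 1]
            i, j = i - 1, j - 1
        elif score[i][j] == score[i - 1][j] + gap:
            i -= 1
        else:
            j -= 1
        aligned += 1
    return 100.0 * identical / aligned if aligned else 0.0


# Made-up sequences differing at one position out of eight.
identity = smith_waterman_identity("ACGTACGT", "ACGTTCGT")
print(identity, identity > 50.0)
```

Here identity is reported as identical positions divided by aligned columns of the local alignment; other conventions (eg. dividing by the length of the shorter sequence) give different figures and should be stated when a threshold is applied.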

Expression systems 

The Neisserial nucleotide sequences can be expressed in a variety of different expression systems; 
for example those used with mammalian cells, baculoviruses, plants, bacteria, and yeast. 

i. Mammalian Systems 

Mammalian expression systems are known in the art. A mammalian promoter is any DNA
sequence capable of binding mammalian RNA polymerase and initiating the downstream (3') 
transcription of a coding sequence (eg. structural gene) into mRNA. A promoter will have a 
transcription initiating region, which is usually placed proximal to the 5' end of the coding
sequence, and a TATA box, usually located 25-30 base pairs (bp) upstream of the transcription
initiation site. The TATA box is thought to direct RNA polymerase II to begin RNA synthesis at 
the correct site. A mammalian promoter will also contain an upstream promoter element, usually 
located within 100 to 200 bp upstream of the TATA box. An upstream promoter element 
determines the rate at which transcription is initiated and can act in either orientation [Sambrook et
al. (1989) "Expression of Cloned Genes in Mammalian Cells." In Molecular Cloning: A 
Laboratory Manual, 2nd ed.]. 

Mammalian viral genes are often highly expressed and have a broad host range; therefore 
sequences encoding mammalian viral genes provide particularly useful promoter sequences. 
Examples include the SV40 early promoter, mouse mammary tumor virus LTR promoter,
adenovirus major late promoter (Ad MLP), and herpes simplex virus promoter. In addition,
sequences derived from non-viral genes, such as the murine metallothionein gene, also provide
useful promoter sequences. Expression may be either constitutive or regulated (inducible);
depending on the promoter, expression can be induced with glucocorticoid in hormone-responsive cells.

The presence of an enhancer element (enhancer), combined with the promoter elements described 
above, will usually increase expression levels. An enhancer is a regulatory DNA sequence that can 
stimulate transcription up to 1000-fold when linked to homologous or heterologous promoters, 
with synthesis beginning at the normal RNA start site. Enhancers are also active when they are 
placed upstream or downstream from the transcription initiation site, in either normal or flipped 
orientation, or at a distance of more than 1000 nucleotides from the promoter [Maniatis et al. 
(1987) Science 236:1237; Alberts et al. (1989) Molecular Biology of the Cell, 2nd ed.]. Enhancer 
elements derived from viruses may be particularly useful, because they usually have a broader host 
range. Examples include the SV40 early gene enhancer [Dijkema et al (1985) EMBO J. 4:761] and
the enhancer/promoters derived from the long terminal repeat (LTR) of the Rous Sarcoma Virus
[Gorman et al. (1982b) Proc. Natl. Acad. Sci. 79:6777] and from human cytomegalovirus [Boshart
et al. (1985) Cell 41:521]. Additionally, some enhancers are regulatable and become active only in
the presence of an inducer, such as a hormone or metal ion [Sassone-Corsi and Borelli (1986)
Trends Genet. 2:215; Maniatis et al. (1987) Science 236:1237].

A DNA molecule may be expressed intracellularly in mammalian cells. A promoter sequence may
be directly linked with the DNA molecule, in which case the first amino acid at the N-terminus of 
the recombinant protein will always be a methionine, which is encoded by the ATG start codon. If 
desired, the N-terminus may be cleaved from the protein by in vitro incubation with cyanogen 
bromide.

Alternatively, foreign proteins can also be secreted from the cell into the growth media by creating 
chimeric DNA molecules that encode a fusion protein comprised of a leader sequence fragment 
that provides for secretion of the foreign protein in mammalian cells. Preferably, there are 
processing sites encoded between the leader fragment and the foreign gene that can be cleaved 
either in vivo or in vitro. The leader sequence fragment usually encodes a signal peptide comprised
of hydrophobic amino acids which direct the secretion of the protein from the cell. The adenovirus
tripartite leader is an example of a leader sequence that provides for secretion of a foreign protein in
mammalian cells. 

Usually, transcription termination and polyadenylation sequences recognized by mammalian cells 
are regulatory regions located 3' to the translation stop codon and thus, together with the promoter
elements, flank the coding sequence. The 3' terminus of the mature mRNA is formed by
site-specific post-transcriptional cleavage and polyadenylation [Birnstiel et al. (1985) Cell 41:349;
Proudfoot and Whitelaw (1988) "Termination and 3' end processing of eukaryotic RNA." In
Transcription and splicing (ed. B.D. Hames and D.M. Glover); Proudfoot (1989) Trends Biochem.
Sci. 14:105]. These sequences direct the transcription of an mRNA which can be translated into the
polypeptide encoded by the DNA. Examples of transcription terminator/polyadenylation signals
include those derived from SV40 [Sambrook et al (1989) "Expression of cloned genes in cultured
mammalian cells." In Molecular Cloning: A Laboratory Manual].

Usually, the above described components, comprising a promoter, polyadenylation signal, and 
transcription termination sequence are put together into expression constructs. Enhancers, introns
with functional splice donor and acceptor sites, and leader sequences may also be included in an
expression construct, if desired. Expression constructs are often maintained in a replicon, such as
an extrachromosomal element (eg. plasmids) capable of stable maintenance in a host, such as
mammalian cells or bacteria. Mammalian replication systems include those derived from animal
viruses, which require trans-acting factors to replicate. For example, plasmids containing the
replication systems of papovaviruses, such as SV40 [Gluzman (1981) Cell 23:175] or
polyomavirus, replicate to extremely high copy number in the presence of the appropriate viral T
antigen. Additional examples of mammalian replicons include those derived from bovine
papillomavirus and Epstein-Barr virus. Additionally, the replicon may have two replication systems,
thus allowing it to be maintained, for example, in mammalian cells for expression and in a
prokaryotic host for cloning and amplification. Examples of such mammalian-bacteria shuttle
vectors include pMT2 [Kaufman et al. (1989) Mol. Cell. Biol. 9:946] and pHEBO [Shimizu et al.
(1986) Mol. Cell. Biol. 6:1074].

The transformation procedure used depends upon the host to be transformed. Methods for 
introduction of heterologous polynucleotides into mammalian cells are known in the art and 
include dextran-mediated transfection, calcium phosphate precipitation, polybrene mediated 
transfection, protoplast fusion, electroporation, encapsulation of the polynucleotide(s) in 
liposomes, and direct microinjection of the DNA into nuclei.

Mammalian cell lines available as hosts for expression are known in the art and include many 
immortalized cell lines available from the American Type Culture Collection (ATCC), including 
but not limited to, Chinese hamster ovary (CHO) cells, HeLa cells, baby hamster kidney (BHK) 
cells, monkey kidney cells (COS), human hepatocellular carcinoma cells (eg. Hep G2), and a 
number of other cell lines.

ii. Baculovirus Systems 

The polynucleotide encoding the protein can also be inserted into a suitable insect expression 
vector, and is operably linked to the control elements within that vector. Vector construction 
employs techniques which are known in the art. Generally, the components of the expression 
system include a transfer vector, usually a bacterial plasmid, which contains both a fragment of the
baculovirus genome, and a convenient restriction site for insertion of the heterologous gene or
genes to be expressed; a wild type baculovirus with a sequence homologous to the
baculovirus-specific fragment in the transfer vector (this allows for the homologous recombination of the
heterologous gene into the baculovirus genome); and appropriate insect host cells and growth
media. 

After inserting the DNA sequence encoding the protein into the transfer vector, the vector and the 
wild type viral genome are transfected into an insect host cell where the vector and viral genome 
are allowed to recombine. The packaged recombinant virus is expressed and recombinant plaques
are identified and purified. Materials and methods for baculovirus/insect cell expression systems 
are commercially available in kit form from, inter alia, Invitrogen, San Diego CA ("MaxBac" kit). 
These techniques are generally known to those skilled in the art and fully described in Summers 
and Smith, Texas Agricultural Experiment Station Bulletin No. 1555 (1987) (hereinafter "Summers 
and Smith").

Prior to inserting the DNA sequence encoding the protein into the baculovirus genome, the above 
described components, comprising a promoter, leader (if desired), coding sequence of interest, and 
transcription termination sequence, are usually assembled into an intermediate transplacement 
construct (transfer vector). This construct may contain a single gene and operably linked regulatory 
elements; multiple genes, each with its own set of operably linked regulatory elements; or
multiple genes, regulated by the same set of regulatory elements. Intermediate transplacement
constructs are often maintained in a replicon, such as an extrachromosomal element (eg. plasmids)
capable of stable maintenance in a host, such as a bacterium. The replicon will have a replication 
system, thus allowing it to be maintained in a suitable host for cloning and amplification. 

Currently, the most commonly used transfer vector for introducing foreign genes into AcNPV is
pAc373. Many other vectors, known to those of skill in the art, have also been designed. These
include, for example, pVL985 (which alters the polyhedrin start codon from ATG to ATT, and
which introduces a BamHI cloning site 32 basepairs downstream from the ATT; see Luckow and
Summers, Virology (1989) 17:31).

The plasmid usually also contains the polyhedrin polyadenylation signal (Miller et al. (1988) Ann.
Rev. Microbiol., 42:177) and a prokaryotic ampicillin-resistance (amp) gene and origin of
replication for selection and propagation in E. coli.

Baculovirus transfer vectors usually contain a baculovirus promoter. A baculovirus promoter is any
DNA sequence capable of binding a baculovirus RNA polymerase and initiating the downstream 
(5' to 3') transcription of a coding sequence (eg. structural gene) into mRNA. A promoter will have 
a transcription initiation region which is usually placed proximal to the 5' end of the coding 
sequence. This transcription initiation region usually includes an RNA polymerase binding site and
a transcription initiation site. A baculovirus transfer vector may also have a second domain called 
an enhancer, which, if present, is usually distal to the structural gene. Expression may be either 
regulated or constitutive. 

Structural genes, abundantly transcribed at late times in a viral infection cycle, provide particularly 
useful promoter sequences. Examples include sequences derived from the gene encoding the viral
polyhedrin protein, Friesen et al., (1986) "The Regulation of Baculovirus Gene Expression," in:
The Molecular Biology of Baculoviruses (ed. Walter Doerfler); EPO Publ. Nos. 127 839 and 155
476; and the gene encoding the p10 protein, Vlak et al., (1988), J. Gen. Virol. 69:765.

DNA encoding suitable signal sequences can be derived from genes for secreted insect or 
baculovirus proteins, such as the baculovirus polyhedrin gene (Carbonell et al. (1988) Gene,
73:409). Alternatively, since the signals for mammalian cell posttranslational modifications (such
as signal peptide cleavage, proteolytic cleavage, and phosphorylation) appear to be recognized by
insect cells, and the signals required for secretion and nuclear accumulation also appear to be
conserved between the invertebrate cells and vertebrate cells, leaders of non-insect origin, such as
those derived from genes encoding human α-interferon, Maeda et al., (1985), Nature 315:592;
human gastrin-releasing peptide, Lebacq-Verheyden et al., (1988), Molec. Cell. Biol. 8:3129;
human IL-2, Smith et al., (1985) Proc. Natl. Acad. Sci. USA, 82:8404; mouse IL-3, Miyajima et
al., (1987) Gene 58:213; and human glucocerebrosidase, Martin et al. (1988) DNA, 7:99, can also
be used to provide for secretion in insects.

A recombinant polypeptide or polyprotein may be expressed intracellularly or, if it is expressed
with the proper regulatory sequences, it can be secreted. Good intracellular expression of nonfused
foreign proteins usually requires heterologous genes that ideally have a short leader sequence
containing suitable translation initiation signals preceding an ATG start signal. If desired,
methionine at the N-terminus may be cleaved from the mature protein by in vitro incubation with
cyanogen bromide. 

Alternatively, recombinant polyproteins or proteins which are not naturally secreted can be 
secreted from the insect cell by creating chimeric DNA molecules that encode a fusion protein 
comprised of a leader sequence fragment that provides for secretion of the foreign protein in
insects. The leader sequence fragment usually encodes a signal peptide comprised of hydrophobic 
amino acids which direct the translocation of the protein into the endoplasmic reticulum. 

After insertion of the DNA sequence and/or the gene encoding the expression product precursor of 
the protein, an insect cell host is co-transformed with the heterologous DNA of the transfer vector
and the genomic DNA of wild type baculovirus - usually by co-transfection. The promoter and
transcription termination sequence of the construct will usually comprise a 2-5kb section of the
baculovirus genome. Methods for introducing heterologous DNA into the desired site in the
baculovirus are known in the art. (See Summers and Smith supra; Ju et al. (1987); Smith et
al., Mol. Cell. Biol. (1983) 3:2156; and Luckow and Summers (1989)). For example, the insertion
can be into a gene such as the polyhedrin gene, by homologous double crossover recombination;
insertion can also be into a restriction enzyme site engineered into the desired baculovirus gene. 
Miller et al., (1989), Bioessays 4:91. The DNA sequence, when cloned in place of the polyhedrin 
gene in the expression vector, is flanked both 5' and 3' by polyhedrin-specific sequences and is 
positioned downstream of the polyhedrin promoter. 

The newly formed baculovirus expression vector is subsequently packaged into an infectious
recombinant baculovirus. Homologous recombination occurs at low frequency (between about 1%
and about 5%); thus, the majority of the virus produced after cotransfection is still wild-type virus.
Therefore, a method is necessary to identify recombinant viruses. An advantage of the expression
system is a visual screen allowing recombinant viruses to be distinguished. The polyhedrin protein,
which is produced by the native virus, is produced at very high levels in the nuclei of infected cells
at late times after viral infection. Accumulated polyhedrin protein forms occlusion bodies that also
contain embedded particles. These occlusion bodies, up to 15 μm in size, are highly refractile,
giving them a bright shiny appearance that is readily visualized under the light microscope. Cells
infected with recombinant viruses lack occlusion bodies. To distinguish recombinant virus from
wild-type virus, the transfection supernatant is plaqued onto a monolayer of insect cells by
techniques known to those skilled in the art. Namely, the plaques are screened under the light
microscope for the presence (indicative of wild-type virus) or absence (indicative of recombinant
virus) of occlusion bodies. "Current Protocols in Microbiology" Vol. 2 (Ausubel et al. eds) at 16.8
(Supp. 10, 1990); Summers and Smith, supra; Miller et al. (1989).
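
To give a feel for the screening effort implied by a recombination frequency of about 1% to about 5%, the following Python sketch estimates how many plaques would need to be examined to find at least one recombinant with a chosen probability. It assumes, purely for illustration, that each plaque is independently recombinant with the stated frequency; the function name and the 95% confidence level are hypothetical choices, not part of the described method.

```python
import math


def plaques_needed(recombination_frequency, confidence=0.95):
    """Smallest number N of plaques such that 1 - (1 - p)**N >= confidence,
    assuming each plaque is independently recombinant with probability p."""
    return math.ceil(
        math.log(1.0 - confidence) / math.log(1.0 - recombination_frequency)
    )


for p in (0.01, 0.05):
    print(p, plaques_needed(p))
```

Under these assumptions roughly 300 plaques are needed at a 1% frequency and roughly 60 at 5%, which is why a rapid visual screen for occlusion bodies is useful.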

Recombinant baculovirus expression vectors have been developed for infection into several insect 
cells. For example, recombinant baculoviruses have been developed for, inter alia: Aedes aegypti,
Autographa californica, Bombyx mori, Drosophila melanogaster, Spodoptera frugiperda, and
Trichoplusia ni (WO 89/046699; Carbonell et al., (1985) J. Virol. 56:153; Wright (1986) Nature
321:718; Smith et al., (1983) Mol. Cell. Biol. 3:2156; and see generally, Fraser, et al. (1989) In
Vitro Cell. Dev. Biol. 25:225).

Cells and cell culture media are commercially available for both direct and fusion expression of 
heterologous polypeptides in a baculovirus/expression system; cell culture technology is generally 
known to those skilled in the art. See, eg. Summers and Smith supra.

The modified insect cells may then be grown in an appropriate nutrient medium, which allows for 
stable maintenance of the plasmid(s) present in the modified insect host. Where the expression 
product gene is under inducible control, the host may be grown to high density, and expression 
induced. Alternatively, where expression is constitutive, the product will be continuously expressed
into the medium and the nutrient medium must be continuously circulated, while removing the
product of interest and augmenting depleted nutrients. The product may be purified by such
techniques as chromatography, eg. HPLC, affinity chromatography, ion exchange chromatography,
etc.; electrophoresis; density gradient centrifugation; solvent extraction, or the like. As appropriate,
the product may be further purified, as required, so as to remove substantially any insect proteins
which are also secreted in the medium or result from lysis of insect cells, so as to provide a product
which is at least substantially free of host debris, eg. proteins, lipids and polysaccharides. 

In order to obtain protein expression, recombinant host cells derived from the transformants are 
incubated under conditions which allow expression of the recombinant protein encoding sequence.
These conditions will vary, dependent upon the host cell selected. However, the conditions are
readily ascertainable to those of ordinary skill in the art, based upon what is known in the art. 

iii. Plant Systems 

There are many plant cell culture and whole plant genetic expression systems known in the art. 
Exemplary plant cellular genetic expression systems include those described in patents, such as:
US 5,693,506; US 5,659,122; and US 5,608,143. Additional examples of genetic expression in
plant cell culture have been described by Zenk, Phytochemistry 30:3861-3863 (1991). Descriptions
of plant protein signal peptides may be found in addition to the references described above in
Vaulcombe et al., Mol. Gen. Genet. 209:33-40 (1987); Chandler et al., Plant Molecular Biology
3:407-418 (1984); Rogers, J. Biol. Chem. 260:3731-3738 (1985); Rothstein et al., Gene 55:353-356
(1987); Whittier et al., Nucleic Acids Research 15:2515-2535 (1987); Wirsel et al., Molecular
Microbiology 3:3-14 (1989); Yu et al., Gene 122:247-253 (1992). A description of the regulation
of plant gene expression by the phytohormone, gibberellic acid and secreted enzymes induced by
gibberellic acid can be found in R.L. Jones and J. MacMillin, Gibberellins: in: Advanced Plant
Physiology, Malcolm B. Wilkins, ed., 1984 Pitman Publishing Limited, London, pp. 21-52.
References that describe other metabolically-regulated genes: Sheen, Plant Cell, 2:1027-1038
(1990); Maas et al., EMBO J. 9:3447-3452 (1990); Benkel and Hickey, Proc. Natl. Acad. Sci.
84:1337-1339 (1987).

Typically, using techniques known in the art, a desired polynucleotide sequence is inserted into an 
expression cassette comprising genetic regulatory elements designed for operation in plants. The
expression cassette is inserted into a desired expression vector with companion sequences upstream
and downstream from the expression cassette suitable for expression in a plant host. The
companion sequences will be of plasmid or viral origin and provide necessary characteristics to the
vector to permit the vectors to move DNA from an original cloning host, such as bacteria, to the
desired plant host. The basic bacterial/plant vector construct will preferably provide a broad host
range prokaryote replication origin; a prokaryote selectable marker; and, for Agrobacterium 
transformations, T DNA sequences for Agrobacterium-mediated transfer to plant chromosomes. 
Where the heterologous gene is not readily amenable to detection, the construct will preferably also 
have a selectable marker gene suitable for determining if a plant cell has been transformed. A
general review of suitable markers, for example for the members of the grass family, is found in
Wilmink and Dons, 1993, Plant Mol. Biol. Reptr, 11(2):165-185.

Sequences suitable for permitting integration of the heterologous sequence into the plant genome 
are also recommended. These might include transposon sequences and the like for homologous 
recombination as well as Ti sequences which permit random insertion of a heterologous expression
cassette into a plant genome. Suitable prokaryote selectable markers include resistance toward 
antibiotics such as ampicillin or tetracycline. Other DNA sequences encoding additional functions 
may also be present in the vector, as is known in the art. 

The nucleic acid molecules of the subject invention may be included into an expression cassette for 
expression of the protein(s) of interest. Usually, there will be only one expression cassette,
although two or more are feasible. The recombinant expression cassette will contain, in addition to
the heterologous protein encoding sequence, the following elements: a promoter region, plant 5'
untranslated sequences, an initiation codon (depending upon whether or not the structural gene comes
equipped with one), and a transcription and translation termination sequence. Unique restriction
enzyme sites at the 5' and 3' ends of the cassette allow for easy insertion into a pre-existing vector.

A heterologous coding sequence may be for any protein relating to the present invention. The 
sequence encoding the protein of interest will encode a signal peptide which allows processing and 
translocation of the protein, as appropriate, and will usually lack any sequence which might result 
in the binding of the desired protein of the invention to a membrane. Since, for the most part, the
transcriptional initiation region will be for a gene which is expressed and translocated during
germination, by employing the signal peptide which provides for translocation, one may also
provide for translocation of the protein of interest. In this way, the protein(s) of interest will be
translocated from the cells in which they are expressed and may be efficiently harvested. Typically,
secretion in seeds is across the aleurone or scutellar epithelium layer into the endosperm of the
seed. While it is not required that the protein be secreted from the cells in which the protein is
produced, this facilitates the isolation and purification of the recombinant protein. 

Since the ultimate expression of the desired gene product will be in a eucaryotic cell, it is
desirable to determine whether any portion of the cloned gene contains sequences which will be
processed out as introns by the host's spliceosome machinery. If so, site-directed mutagenesis of
the "intron" region may be conducted to prevent losing a portion of the genetic message as a false
intron code (Reed and Maniatis, Cell 41:95-105, 1985).

The vector can be microinjected directly into plant cells by use of micropipettes to mechanically
transfer the recombinant DNA (Crossway, Mol. Gen. Genet. 202:179-185, 1985). The genetic
material may also be transferred into the plant cell by using polyethylene glycol (Krens, et al.,
Nature 296:72-74, 1982). Another method of introduction of nucleic acid segments is high
velocity ballistic penetration by small particles with the nucleic acid either within the matrix of
small beads or particles, or on the surface (Klein, et al., Nature 327:70-73, 1987; Knudsen and
Muller, Planta 185:330-336, 1991, teaching particle bombardment of barley endosperm to create
transgenic barley). Yet another method of introduction would be fusion of protoplasts with other
entities, either minicells, cells, lysosomes or other fusible lipid-surfaced bodies (Fraley, et al.,
Proc. Natl. Acad. Sci. USA 79:1859-1863, 1982).

The vector may also be introduced into the plant cells by electroporation (Fromm et al., Proc.
Natl. Acad. Sci. USA 82:5824, 1985). In this technique, plant protoplasts are electroporated in the
presence of plasmids containing the gene construct. Electrical impulses of high field strength 
reversibly permeabilize biomembranes allowing the introduction of the plasmids. Electroporated 
plant protoplasts reform the cell wall, divide, and form plant callus. 

All plants from which protoplasts can be isolated and cultured to give whole regenerated plants can
be transformed by the present invention so that whole plants are recovered which contain the
transferred gene. It is known that practically all plants can be regenerated from cultured cells or
tissues, including but not limited to all major species of sugarcane, sugar beet, cotton, fruit and
other trees, legumes and vegetables. Some suitable plants include, for example, species from the
genera Fragaria, Lotus, Medicago, Onobrychis, Trifolium, Trigonella, Vigna, Citrus, Linum,
Geranium, Manihot, Daucus, Arabidopsis, Brassica, Raphanus, Sinapis, Atropa, Capsicum,
Datura, Hyoscyamus, Lycopersicon, Nicotiana, Solanum, Petunia, Digitalis, Majorana, Cichorium,
Helianthus, Lactuca, Bromus, Asparagus, Antirrhinum, Hemerocallis, Nemesia, Pelargonium,
Panicum, Pennisetum, Ranunculus, Senecio, Salpiglossis, Cucumis, Browallia, Glycine, Lolium,
Zea, Triticum, and Sorghum.

Means for regeneration vary from species to species of plants, but generally a suspension of
transformed protoplasts containing copies of the heterologous gene is first provided. Callus tissue
is formed and shoots may be induced from callus and subsequently rooted. Alternatively, embryo
formation can be induced from the protoplast suspension. These embryos germinate as natural
embryos to form plants. The culture media will generally contain various amino acids and
hormones, such as auxins and cytokinins. It is also advantageous to add glutamic acid and proline
to the medium, especially for such species as corn and alfalfa. Shoots and roots normally develop
simultaneously. Efficient regeneration will depend on the medium, on the genotype, and on the
history of the culture. If these three variables are controlled, then regeneration is fully reproducible
and repeatable.

In some plant cell culture systems, the desired protein of the invention may be excreted or,
alternatively, the protein may be extracted from the whole plant. Where the desired protein of the
invention is secreted into the medium, it may be collected. Alternatively, the embryos and
embryoless-half seeds or other plant tissue may be mechanically disrupted to release any secreted
protein between cells and tissues. The mixture may be suspended in a buffer solution to retrieve
soluble proteins. Conventional protein isolation and purification methods will then be used to
purify the recombinant protein. Parameters of time, temperature, pH, oxygen, and volumes will be
adjusted through routine methods to optimize expression and recovery of heterologous protein.

iv. Bacterial Systems 

Bacterial expression techniques are known in the art. A bacterial promoter is any DNA sequence
capable of binding bacterial RNA polymerase and initiating the downstream (3') transcription of a
coding sequence (eg. structural gene) into mRNA. A promoter will have a transcription initiation
region which is usually placed proximal to the 5' end of the coding sequence. This transcription
initiation region usually includes an RNA polymerase binding site and a transcription initiation
site. A bacterial promoter may also have a second domain called an operator, that may overlap an
adjacent RNA polymerase binding site at which RNA synthesis begins. The operator permits
negatively regulated (inducible) transcription, as a gene repressor protein may bind the operator
and thereby inhibit transcription of a specific gene. Constitutive expression may occur in the
absence of negative regulatory elements, such as the operator. In addition, positive regulation may
be achieved by a gene activator protein binding sequence, which, if present, is usually proximal (5')
to the RNA polymerase binding sequence. An example of a gene activator protein is the catabolite
activator protein (CAP), which helps initiate transcription of the lac operon in Escherichia coli
(E. coli) [Raibaud et al. (1984) Annu. Rev. Genet. 18:173]. Regulated expression may therefore be
either positive or negative, thereby either enhancing or reducing transcription.

Sequences encoding metabolic pathway enzymes provide particularly useful promoter sequences.
Examples include promoter sequences derived from sugar metabolizing enzymes, such as
galactose, lactose (lac) [Chang et al. (1977) Nature 198:1056], and maltose. Additional examples
include promoter sequences derived from biosynthetic enzymes such as tryptophan (trp) [Goeddel
et al. (1980) Nuc. Acids Res. 8:4057; Yelverton et al. (1981) Nucl. Acids Res. 9:731; US
patent 4,738,921; EP-A-0036776 and EP-A-0121775]. The β-lactamase (bla) promoter system
[Weissmann (1981) "The cloning of interferon and other mistakes." In Interferon 3 (ed. I.
Gresser)], bacteriophage lambda PL [Shimatake et al. (1981) Nature 292:128] and T5 [US
patent 4,689,406] promoter systems also provide useful promoter sequences.

In addition, synthetic promoters which do not occur in nature also function as bacterial promoters.
For example, transcription activation sequences of one bacterial or bacteriophage promoter may be
joined with the operon sequences of another bacterial or bacteriophage promoter, creating a
synthetic hybrid promoter [US patent 4,551,433]. For example, the tac promoter is a hybrid trp-lac
promoter comprised of both trp promoter and lac operon sequences that is regulated by the lac
repressor [Amann et al. (1983) Gene 25:167; de Boer et al. (1983) Proc. Natl. Acad. Sci. 80:21].
Furthermore, a bacterial promoter can include naturally occurring promoters of non-bacterial origin
that have the ability to bind bacterial RNA polymerase and initiate transcription. A naturally
occurring promoter of non-bacterial origin can also be coupled with a compatible RNA polymerase
to produce high levels of expression of some genes in prokaryotes. The bacteriophage T7 RNA
polymerase/promoter system is an example of a coupled promoter system [Studier et al. (1986) J.
Mol. Biol. 189:113; Tabor et al. (1985) Proc. Natl. Acad. Sci. 82:1074]. In addition, a hybrid
promoter can also be comprised of a bacteriophage promoter and an E. coli operator region
(EPO-A-0 267 851).

In addition to a functioning promoter sequence, an efficient ribosome binding site is also useful for
the expression of foreign genes in prokaryotes. In E. coli, the ribosome binding site is called the
Shine-Dalgarno (SD) sequence and includes an initiation codon (ATG) and a sequence 3-9
nucleotides in length located 3-11 nucleotides upstream of the initiation codon [Shine et al. (1975)
Nature 254:34]. The SD sequence is thought to promote binding of mRNA to the ribosome by the
pairing of bases between the SD sequence and the 3' end of E. coli 16S rRNA [Steitz et al. (1979)
"Genetic signals and nucleotide sequences in messenger RNA." In Biological Regulation and
Development: Gene Expression (ed. R.F. Goldberger)]. For the expression of eukaryotic genes and
prokaryotic genes with a weak ribosome-binding site, see Sambrook et al. (1989) "Expression of
cloned genes in Escherichia coli." In Molecular Cloning: A Laboratory Manual.
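
By way of illustration only, the positional relationship just described (a sequence of 3-9 nucleotides lying 3-11 nucleotides upstream of the initiation codon) can be checked with a short sketch. The reference string TAAGGAGGT is an assumption of this sketch: it is the DNA-alphabet complement of the 3' end of E. coli 16S rRNA and is not recited in the text above. The function name is hypothetical, and "3-11 nucleotides upstream" is read here as the spacing between the end of the SD-like sequence and the ATG.

```python
# Assumed reference: DNA-alphabet complement of the 3' end of E. coli 16S rRNA.
FULL_SD = "TAAGGAGGT"


def find_sd_upstream(seq, atg_pos, min_len=3, max_len=9, min_gap=3, max_gap=11):
    """Return (sd_like_sequence, gap) for the longest SD-like match, or None.

    seq is a DNA string, atg_pos the 0-based index of the initiation codon,
    and gap the number of nucleotides between the match and the ATG.
    """
    seq = seq.upper()
    best = None
    for gap in range(min_gap, max_gap + 1):
        end = atg_pos - gap
        # Try the longest candidate first at each spacing.
        for length in range(max_len, min_len - 1, -1):
            start = end - length
            if start < 0:
                continue
            candidate = seq[start:end]
            if candidate in FULL_SD:
                if best is None or length > len(best[0]):
                    best = (candidate, gap)
                break
    return best
```

Because any 3-nucleotide substring of the reference counts as a match, short hits are weak evidence; a practical screen would likely set a higher `min_len` or score the pairing energetically.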

A DNA molecule may be expressed intracellularly. A promoter sequence may be directly linked
with the DNA molecule, in which case the first amino acid at the N-terminus will always be a
methionine, which is encoded by the ATG start codon. If desired, methionine at the N-terminus
may be cleaved from the protein by in vitro incubation with cyanogen bromide or by either in vivo
or in vitro incubation with a bacterial methionine N-terminal peptidase (EPO-A-0 219 237).

Fusion proteins provide an alternative to direct expression. Usually, a DNA sequence encoding the
N-terminal portion of an endogenous bacterial protein, or other stable protein, is fused to the 5' end
of heterologous coding sequences. Upon expression, this construct will provide a fusion of the two
amino acid sequences. For example, the bacteriophage lambda cell gene can be linked at the 5'
terminus of a foreign gene and expressed in bacteria. The resulting fusion protein preferably retains
a site for a processing enzyme (factor Xa) to cleave the bacteriophage protein from the foreign
gene [Nagai et al. (1984) Nature 309:810]. Fusion proteins can also be made with sequences from
the lacZ [Jia et al. (1987) Gene 60:197], trpE [Allen et al. (1987) J. Biotechnol. 5:93; Makoff et al.
(1989) J. Gen. Microbiol. 135:11], and Chey [EP-A-0 324 647] genes. The DNA sequence at the
junction of the two amino acid sequences may or may not encode a cleavable site. Another
example is a ubiquitin fusion protein. Such a fusion protein is made with the ubiquitin region that
preferably retains a site for a processing enzyme (eg. ubiquitin specific processing-protease) to
cleave the ubiquitin from the foreign protein. Through this method, native foreign protein can be
isolated [Miller et al. (1989) Bio/Technology 7:698].
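
The release of a foreign protein from a fusion partner at a processing-enzyme site can be sketched as follows. The sketch assumes the commonly used factor Xa recognition sequence Ile-Glu-Gly-Arg (IEGR), with cleavage after the arginine; that motif, the function name and the example sequences are illustrative assumptions and are not taken from the text above.

```python
# Assumed recognition motif for factor Xa (one-letter amino acid codes).
FACTOR_XA_SITE = "IEGR"


def cleave_fusion(fusion_protein, site=FACTOR_XA_SITE):
    """Split a fusion protein after the first occurrence of the cleavage site.

    Returns (carrier_with_site, released_protein). Raises ValueError if the
    junction does not encode the cleavable site.
    """
    fusion_protein = fusion_protein.upper()
    index = fusion_protein.find(site)
    if index == -1:
        raise ValueError("no cleavable site found in the fusion protein")
    cut = index + len(site)  # cleavage occurs after the final residue of the site
    return fusion_protein[:cut], fusion_protein[cut:]
```

Since only the first occurrence of the motif is used, a foreign protein that itself contains the motif would also be cut by the enzyme in practice; a fuller model would report every occurrence.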

Alternatively, foreign proteins can also be secreted from the cell by creating chimeric DNA 
molecules that encode a fusion protein comprised of a signal peptide sequence fragment that 
provides for secretion of the foreign protein in bacteria [US patent 4,336,336]. The signal sequence 
fragment usually encodes a signal peptide comprised of hydrophobic amino acids which direct the 
secretion of the protein from the cell. The protein is either secreted into the growth media (gram- 
positive bacteria) or into the periplasmic space, located between the inner and outer membrane of 
the cell (gram-negative bacteria). Preferably there are processing sites, which can be cleaved either
in vivo or in vitro, encoded between the signal peptide fragment and the foreign gene.

DNA encoding suitable signal sequences can be derived from genes for secreted bacterial proteins,
such as the E. coli outer membrane protein gene (ompA) [Masui et al. (1983), in: Experimental
Manipulation of Gene Expression; Ghrayeb et al. (1984) EMBO J. 3:2437] and the E. coli alkaline
phosphatase signal sequence (phoA) [Oka et al. (1985) Proc. Natl. Acad. Sci. 82:7212]. As an
additional example, the signal sequence of the alpha-amylase gene from various Bacillus strains
can be used to secrete heterologous proteins from B. subtilis [Palva et al. (1982) Proc. Natl. Acad.
Sci. USA 79:5582; EP-A-0 244 042].

Usually, transcription termination sequences recognized by bacteria are regulatory regions located 
3' to the translation stop codon, and thus together with the promoter flank the coding sequence. 
These sequences direct the transcription of an mRNA which can be translated into the polypeptide 
encoded by the DNA. Transcription termination sequences frequently include DNA sequences of 
about 50 nucleotides capable of forming stem loop structures that aid in terminating transcription. 
Examples include transcription termination sequences derived from genes with strong promoters, 
such as the trp gene in E. coli as well as other biosynthetic genes. 
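
As a rough illustration of the stem loop structures mentioned above, the following sketch scans a DNA sequence for short inverted repeats separated by a loop. The stem length, the loop range, the function names and the example sequence are all arbitrary assumptions made for illustration; this is not a terminator prediction method.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of a DNA string."""
    return seq.translate(COMPLEMENT)[::-1]


def find_stem_loops(seq, stem_len=6, min_loop=3, max_loop=8):
    """List (start, left_arm, loop_length, right_arm) for perfect inverted repeats."""
    seq = seq.upper()
    hits = []
    for start in range(len(seq) - 2 * stem_len - min_loop + 1):
        left = seq[start:start + stem_len]
        target = reverse_complement(left)
        for loop in range(min_loop, max_loop + 1):
            right_start = start + stem_len + loop
            right = seq[right_start:right_start + stem_len]
            if len(right) < stem_len:
                break  # ran off the end of the sequence
            if right == target:
                hits.append((start, left, loop, right))
    return hits
```

Only perfect base pairing is detected; real terminators tolerate mismatches and G-U pairs, so a scoring scheme would be needed for anything beyond a first look.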

Usually, the above described components, comprising a promoter, signal sequence (if desired),
coding sequence of interest, and transcription termination sequence, are put together into
expression constructs. Expression constructs are often maintained in a replicon, such as an
extrachromosomal element (eg. plasmids) capable of stable maintenance in a host, such as bacteria.
The replicon will have a replication system, thus allowing it to be maintained in a prokaryotic host
either for expression or for cloning and amplification. In addition, a replicon may be either a high
or low copy number plasmid. A high copy number plasmid will generally have a copy number
ranging from about 5 to about 200, and usually about 10 to about 150. A host containing a high
copy number plasmid will preferably contain at least about 10, and more preferably at least about
20 plasmids. Either a high or low copy number vector may be selected, depending upon the effect
of the vector and the foreign protein on the host.

Alternatively, the expression constructs can be integrated into the bacterial genome with an 
integrating vector. Integrating vectors usually contain at least one sequence homologous to the 
bacterial chromosome that allows the vector to integrate. Integrations appear to result from 
recombinations between homologous DNA in the vector and the bacterial chromosome. For 
example, integrating vectors constructed with DNA from various Bacillus strains integrate into the 
Bacillus chromosome (EP-A-0 127 328). Integrating vectors may also be comprised of
bacteriophage or transposon sequences. 

Usually, extrachromosomal and integrating expression constructs may contain selectable markers 
to allow for the selection of bacterial strains that have been transformed. Selectable markers can be 
expressed in the bacterial host and may include genes which render bacteria resistant to drugs such 
as ampicillin, chloramphenicol, erythromycin, kanamycin (neomycin), and tetracycline [Davies et 
al (1978) Annu. Rev. Microbiol 32:469]. Selectable markers may also include biosynthetic genes, 
such as those in the histidine, tryptophan, and leucine biosynthetic pathways. 

Alternatively, some of the above described components can be put together in transformation 
vectors. Transformation vectors are usually comprised of a selectable marker that is either
maintained in a replicon or developed into an integrating vector, as described above. 

Expression and transformation vectors, either extra-chromosomal replicons or integrating vectors,
have been developed for transformation into many bacteria. For example, expression vectors have
been developed for, inter alia, the following bacteria: Bacillus subtilis [Palva et al. (1982) Proc.
Natl. Acad. Sci. USA 79:5582; EP-A-0 036 259 and EP-A-0 063 953; WO 84/04541], Escherichia
coli [Shimatake et al. (1981) Nature 292:128; Amann et al. (1985) Gene 40:183; Studier et al.
(1986) J. Mol. Biol. 189:113; EP-A-0 036 776, EP-A-0 136 829 and EP-A-0 136 907],
Streptococcus cremoris [Powell et al. (1988) Appl. Environ. Microbiol. 54:655], Streptococcus
lividans [Powell et al. (1988) Appl. Environ. Microbiol. 54:655], Streptomyces lividans [US patent
4,745,056].

Methods of introducing exogenous DNA into bacterial hosts are well-known in the art, and usually
include the transformation of bacteria treated with CaCl2 or other agents, such as divalent
cations and DMSO. DNA can also be introduced into bacterial cells by electroporation.
Transformation procedures usually vary with the bacterial species to be transformed. See eg.
[Masson et al. (1989) FEMS Microbiol. Lett. 60:213; Palva et al. (1982) Proc. Natl. Acad. Sci.
USA 79:5582; EP-A-0 036 259 and EP-A-0 063 953; WO 84/04541, Bacillus], [Miller et al. (1988)
Proc. Natl. Acad. Sci. 85:856; Wang et al. (1990) J. Bacteriol. 172:949, Campylobacter], [Cohen et
al. (1973) Proc. Natl. Acad. Sci. 69:2110; Dower et al. (1988) Nucleic Acids Res. 16:6127;
Kushner (1978) "An improved method for transformation of Escherichia coli with ColE1-derived
plasmids." In Genetic Engineering: Proceedings of the International Symposium on Genetic
Engineering (eds. H.W. Boyer and S. Nicosia); Mandel et al. (1970) J. Mol. Biol. 53:159; Taketo
(1988) Biochim. Biophys. Acta 949:318, Escherichia], [Chassy et al. (1987) FEMS Microbiol. Lett.
44:173, Lactobacillus], [Fiedler et al. (1988) Anal. Biochem. 170:38, Pseudomonas], [Augustin et al.
(1990) FEMS Microbiol. Lett. 66:203, Staphylococcus], [Barany et al. (1980) J. Bacteriol.
144:698; Harlander (1987) "Transformation of Streptococcus lactis by electroporation," in:
Streptococcal Genetics (ed. J. Ferretti and R. Curtiss III); Perry et al. (1981) Infect. Immun.
32:1295; Powell et al. (1988) Appl. Environ. Microbiol. 54:655; Somkuti et al. (1987) Proc. 4th
Eur. Cong. Biotechnology 7:412, Streptococcus].

v. Yeast Expression 

Yeast expression systems are also known to one of ordinary skill in the art. A yeast promoter is any
DNA sequence capable of binding yeast RNA polymerase and initiating the downstream (3')
transcription of a coding sequence (eg. structural gene) into mRNA. A promoter will have a
transcription initiation region which is usually placed proximal to the 5' end of the coding
sequence. This transcription initiation region usually includes an RNA polymerase binding site (the
"TATA Box") and a transcription initiation site. A yeast promoter may also have a second domain
called an upstream activator sequence (UAS), which, if present, is usually distal to the structural
gene. The UAS permits regulated (inducible) expression. Constitutive expression occurs in the
absence of a UAS. Regulated expression may be either positive or negative, thereby either
enhancing or reducing transcription.

Yeast is a fermenting organism with an active metabolic pathway; therefore, sequences encoding
enzymes in the metabolic pathway provide particularly useful promoter sequences. Examples
include alcohol dehydrogenase (ADH) (EP-A-0 284 044), enolase, glucokinase, glucose-6-phosphate
isomerase, glyceraldehyde-3-phosphate-dehydrogenase (GAP or GAPDH), hexokinase,
phosphofructokinase, 3-phosphoglycerate mutase, and pyruvate kinase (PyK) (EPO-A-0 329 203).
The yeast PHO5 gene, encoding acid phosphatase, also provides useful promoter sequences
[Myanohara et al. (1983) Proc. Natl. Acad. Sci. USA 80:1].

In addition, synthetic promoters which do not occur in nature also function as yeast promoters. For
example, UAS sequences of one yeast promoter may be joined with the transcription activation
region of another yeast promoter, creating a synthetic hybrid promoter. Examples of such hybrid
promoters include the ADH regulatory sequence linked to the GAP transcription activation region
(US Patent Nos. 4,876,197 and 4,880,734). Other examples of hybrid promoters include promoters
which consist of the regulatory sequences of either the ADH2, GAL4, GAL10, or PHO5 genes,
combined with the transcriptional activation region of a glycolytic enzyme gene such as GAP or
PyK (EP-A-0 164 556). Furthermore, a yeast promoter can include naturally occurring promoters
of non-yeast origin that have the ability to bind yeast RNA polymerase and initiate transcription.
Examples of such promoters include, inter alia, [Cohen et al. (1980) Proc. Natl. Acad. Sci. USA
77:1078; Henikoff et al. (1981) Nature 283:835; Hollenberg et al. (1981) Curr. Topics Microbiol.
Immunol. 96:119; Hollenberg et al. (1979) "The Expression of Bacterial Antibiotic Resistance
Genes in the Yeast Saccharomyces cerevisiae," in: Plasmids of Medical, Environmental and
Commercial Importance (eds. K.N. Timmis and A. Puhler); Mercerau-Puigalon et al. (1980) Gene
11:163; Panthier et al. (1980) Curr. Genet. 2:109].

A DNA molecule may be expressed intracellularly in yeast. A promoter sequence may be directly
linked with the DNA molecule, in which case the first amino acid at the N-terminus of the
recombinant protein will always be a methionine, which is encoded by the ATG start codon. If
desired, methionine at the N-terminus may be cleaved from the protein by in vitro incubation with
cyanogen bromide.

Fusion proteins provide an alternative for yeast expression systems, as well as in mammalian,
baculovirus, and bacterial expression systems. Usually, a DNA sequence encoding the N-terminal
portion of an endogenous yeast protein, or other stable protein, is fused to the 5' end of
heterologous coding sequences. Upon expression, this construct will provide a fusion of the two
amino acid sequences. For example, the yeast or human superoxide dismutase (SOD) gene can be
linked at the 5' terminus of a foreign gene and expressed in yeast. The DNA sequence at the
junction of the two amino acid sequences may or may not encode a cleavable site. See eg. EP-A-0
196 056. Another example is a ubiquitin fusion protein. Such a fusion protein is made with the
ubiquitin region that preferably retains a site for a processing enzyme (eg. ubiquitin-specific
processing protease) to cleave the ubiquitin from the foreign protein. Through this method,
therefore, native foreign protein can be isolated (eg. WO88/024066).

Alternatively, foreign proteins can also be secreted from the cell into the growth media by creating
chimeric DNA molecules that encode a fusion protein comprised of a leader sequence fragment
that provides for secretion in yeast of the foreign protein. Preferably, there are processing sites
encoded between the leader fragment and the foreign gene that can be cleaved either in vivo or in
vitro. The leader sequence fragment usually encodes a signal peptide comprised of hydrophobic
amino acids which direct the secretion of the protein from the cell.

DNA encoding suitable signal sequences can be derived from genes for secreted yeast proteins, 
such as the yeast invertase gene (EP-A-0 012 873; JPO. 62,096,086) and the A-factor gene (US 
patent 4,588,684). Alternatively, leaders of non-yeast origin, such as an interferon leader, exist that
also provide for secretion in yeast (EP-A-0 060 057). 

A preferred class of secretion leaders are those that employ a fragment of the yeast alpha-factor
gene, which contains both a "pre" signal sequence and a "pro" region. The types of alpha-factor
fragments that can be employed include the full-length pre-pro alpha factor leader (about 83 amino
acid residues) as well as truncated alpha-factor leaders (usually about 25 to about 50 amino acid
residues) (US Patents 4,546,083 and 4,870,008; EP-A-0 324 274). Additional leaders employing an
alpha-factor leader fragment that provides for secretion include hybrid alpha-factor leaders made
with a presequence of a first yeast, but a pro-region from a second yeast alpha-factor (eg. see WO
89/02463).

Usually, transcription termination sequences recognized by yeast are regulatory regions located 3'
to the translation stop codon, and thus together with the promoter flank the coding sequence. These
sequences direct the transcription of an mRNA which can be translated into the polypeptide
encoded by the DNA. Examples of transcription terminator sequences and other yeast-recognized
termination sequences include those of genes coding for glycolytic enzymes.

Usually, the above described components, comprising a promoter, leader (if desired), coding 
sequence of interest, and transcription termination sequence, are put together into expression 
constructs. Expression constructs are often maintained in a replicon, such as an extrachromosomal 
element (eg. plasmids) capable of stable maintenance in a host, such as yeast or bacteria. The 
replicon may have two replication systems, thus allowing it to be maintained, for example, in yeast 
for expression and in a prokaryotic host for cloning and amplification. Examples of such
yeast-bacteria shuttle vectors include YEp24 [Botstein et al. (1979) Gene 8:17-24], pCl/1 [Brake
et al. (1984) Proc. Natl. Acad. Sci. USA 81:4642-4646], and YRp17 [Stinchcomb et al. (1982) J. Mol.
Biol. 158:157]. In addition, a replicon may be either a high or low copy number plasmid. A high
copy number plasmid will generally have a copy number ranging from about 5 to about 200, and
usually about 10 to about 150. A host containing a high copy number plasmid will preferably have
at least about 10, and more preferably at least about 20. Either a high or low copy number vector
may be selected, depending upon the effect of the vector and the foreign protein on the host. See
eg. Brake et al., supra.



Alternatively, the expression constructs can be integrated into the yeast genome with an integrating
vector. Integrating vectors usually contain at least one sequence homologous to a yeast
chromosome that allows the vector to integrate, and preferably contain two homologous sequences
flanking the expression construct. Integrations appear to result from recombinations between
homologous DNA in the vector and the yeast chromosome [Orr-Weaver et al. (1983) Methods in
Enzymol. 101:228-245]. An integrating vector may be directed to a specific locus in yeast by
selecting the appropriate homologous sequence for inclusion in the vector. See Orr-Weaver et al.,
supra. One or more expression constructs may integrate, possibly affecting levels of recombinant
protein produced [Rine et al. (1983) Proc. Natl. Acad. Sci. USA 80:6750]. The chromosomal
sequences included in the vector can occur either as a single segment in the vector, which results in
the integration of the entire vector, or two segments homologous to adjacent segments in the
chromosome and flanking the expression construct in the vector, which can result in the stable
integration of only the expression construct.

Usually, extrachromosomal and integrating expression constructs may contain selectable markers
to allow for the selection of yeast strains that have been transformed. Selectable markers may
include biosynthetic genes that can be expressed in the yeast host, such as ADE2, HIS4, LEU2,
TRP1, and ALG7, and the G418 resistance gene, which confer resistance in yeast cells to
tunicamycin and G418, respectively. In addition, a suitable selectable marker may also provide
yeast with the ability to grow in the presence of toxic compounds, such as metals. For example, the
presence of CUP1 allows yeast to grow in the presence of copper ions [Butt et al. (1987)
Microbiol. Rev. 51:351].

Alternatively, some of the above described components can be put together into transformation 
vectors. Transformation vectors are usually comprised of a selectable marker that is either
maintained in a replicon or developed into an integrating vector, as described above. 

Expression and transformation vectors, either extrachromosomal replicons or integrating vectors,
have been developed for transformation into many yeasts. For example, expression vectors have
been developed for, inter alia, the following yeasts: Candida albicans [Kurtz, et al. (1986) Mol.
Cell. Biol. 6:142], Candida maltosa [Kunze, et al. (1985) J. Basic Microbiol. 25:141], Hansenula
polymorpha [Gleeson, et al. (1986) J. Gen. Microbiol. 132:3459; Roggenkamp et al. (1986) Mol.
Gen. Genet. 202:302], Kluyveromyces fragilis [Das, et al. (1984) J. Bacteriol. 158:1165],
Kluyveromyces lactis [De Louvencourt et al. (1983) J. Bacteriol. 154:737; Van den Berg et al.
(1990) Bio/Technology 8:135], Pichia guilliermondii [Kunze et al. (1985) J. Basic Microbiol.
25:141], Pichia pastoris [Cregg, et al. (1985) Mol. Cell. Biol. 5:3376; US Patent Nos. 4,837,148
and 4,929,555], Saccharomyces cerevisiae [Hinnen et al. (1978) Proc. Natl. Acad. Sci. USA
75:1929; Ito et al. (1983) J. Bacteriol. 153:163], Schizosaccharomyces pombe [Beach and Nurse
(1981) Nature 300:706], and Yarrowia lipolytica [Davidow, et al. (1985) Curr. Genet. 10:39;
Gaillardin, et al. (1985) Curr. Genet. 10:49].

Methods of introducing exogenous DNA into yeast hosts are well-known in the art, and usually
include either the transformation of spheroplasts or of intact yeast cells treated with alkali cations.
Transformation procedures usually vary with the yeast species to be transformed. See eg. [Kurtz et
al. (1986) Mol. Cell. Biol. 6:142; Kunze et al. (1985) J. Basic Microbiol. 25:141; Candida];
[Gleeson et al. (1986) J. Gen. Microbiol. 132:3459; Roggenkamp et al. (1986) Mol. Gen. Genet.
202:302; Hansenula]; [Das et al. (1984) J. Bacteriol. 158:1165; De Louvencourt et al. (1983) J.
Bacteriol. 154:1165; Van den Berg et al. (1990) Bio/Technology 8:135; Kluyveromyces]; [Cregg et
al. (1985) Mol. Cell. Biol. 5:3376; Kunze et al. (1985) J. Basic Microbiol. 25:141; US Patent Nos.
4,837,148 and 4,929,555; Pichia]; [Hinnen et al. (1978) Proc. Natl. Acad. Sci. USA 75:1929; Ito et
al. (1983) J. Bacteriol. 153:163; Saccharomyces]; [Beach and Nurse (1981) Nature 300:706;
Schizosaccharomyces]; [Davidow et al. (1985) Curr. Genet. 10:39; Gaillardin et al. (1985) Curr.
Genet. 10:49; Yarrowia].

Antibodies 

As used herein, the term "antibody" refers to a polypeptide or group of polypeptides composed of
at least one antibody combining site. An "antibody combining site" is the three-dimensional
binding space with an internal surface shape and charge distribution complementary to the features
of an epitope of an antigen, which allows a binding of the antibody with the antigen. "Antibody"
includes, for example, vertebrate antibodies, hybrid antibodies, chimeric antibodies, humanised
antibodies, altered antibodies, univalent antibodies, Fab proteins, and single domain antibodies.

Antibodies against the proteins of the invention are useful for affinity chromatography,
immunoassays, and distinguishing/identifying Neisserial proteins.

Antibodies to the proteins of the invention, both polyclonal and monoclonal, may be prepared by
conventional methods. In general, the protein is first used to immunize a suitable animal,
preferably a mouse, rat, rabbit or goat. Rabbits and goats are preferred for the preparation of
polyclonal sera due to the volume of serum obtainable, and the availability of labeled anti-rabbit
and anti-goat antibodies. Immunization is generally performed by mixing or emulsifying the
protein in saline, preferably in an adjuvant such as Freund's complete adjuvant, and injecting the
mixture or emulsion parenterally (generally subcutaneously or intramuscularly). A dose of 50-200
μg/injection is typically sufficient. Immunization is generally boosted 2-6 weeks later with one or
more injections of the protein in saline, preferably using Freund's incomplete adjuvant. One may
alternatively generate antibodies by in vitro immunization using methods known in the art, which
for the purposes of this invention is considered equivalent to in vivo immunization. Polyclonal
antisera are obtained by bleeding the immunized animal into a glass or plastic container, incubating
the blood at 25°C for one hour, followed by incubating at 4°C for 2-18 hours. The serum is
recovered by centrifugation (eg. 1,000g for 10 minutes). About 20-50 ml per bleed may be
obtained from rabbits.

Monoclonal antibodies are prepared using the standard method of Kohler & Milstein [Nature 
(1975) 256:495-96], or a modification thereof. Typically, a mouse or rat is immunized as described 
above. However, rather than bleeding the animal to extract serum, the spleen (and optionally 
several large lymph nodes) is removed and dissociated into single cells. If desired, the spleen cells
may be screened (after removal of nonspecifically adherent cells) by applying a cell suspension to
a plate or well coated with the protein antigen. B-cells expressing membrane-bound 
immunoglobulin specific for the antigen bind to the plate, and are not rinsed away with the rest of 
the suspension. Resulting B-cells, or all dissociated spleen cells, are then induced to fuse with 
myeloma cells to form hybridomas, and are cultured in a selective medium (eg. hypoxanthine,
aminopterin, thymidine medium, "HAT"). The resulting hybridomas are plated by limiting
dilution, and are assayed for the production of antibodies which bind specifically to the 
immunizing antigen (and which do not bind to unrelated antigens). The selected MAb-secreting 
hybridomas are then cultured either in vitro (eg. in tissue culture bottles or hollow fiber reactors), 
or in vivo (as ascites in mice).

If desired, the antibodies (whether polyclonal or monoclonal) may be labeled using conventional
techniques. Suitable labels include fluorophores, chromophores, radioactive atoms (particularly ³²P
and ¹²⁵I), electron-dense reagents, enzymes, and ligands having specific binding partners. Enzymes
are typically detected by their activity. For example, horseradish peroxidase (HRP) is usually detected by
its ability to convert 3,3',5,5'-tetramethylbenzidine (TMB) to a blue pigment, quantifiable with a
spectrophotometer. "Specific binding partner" refers to a protein capable of binding a ligand
molecule with high specificity, as for example in the case of an antigen and a monoclonal antibody
specific therefor. Other specific binding partners include biotin and avidin or streptavidin, IgG and
protein A, and the numerous receptor-ligand couples known in the art. It should be understood that
the above description is not meant to categorize the various labels into distinct classes, as the same
label may serve in several different modes. For example, ¹²⁵I may serve as a radioactive label or as
an electron-dense reagent. HRP may serve as enzyme or as antigen for a MAb. Further, one may
combine various labels for desired effect. For example, MAbs and avidin also require labels in the
practice of this invention: thus, one might label a MAb with biotin, and detect its presence with
avidin labeled with ¹²⁵I, or with an anti-biotin MAb labeled with HRP. Other permutations and
possibilities will be readily apparent to those of ordinary skill in the art, and are considered as
equivalents within the scope of the instant invention.

Pharmaceutical Compositions 

Pharmaceutical compositions can comprise either polypeptides, antibodies, or nucleic acid of the
invention. The pharmaceutical compositions will comprise a therapeutically effective amount of
either polypeptides, antibodies, or polynucleotides of the claimed invention.

The term "therapeutically effective amount" as used herein refers to an amount of a therapeutic
agent to treat, ameliorate, or prevent a desired disease or condition, or to exhibit a detectable
therapeutic or preventative effect. The effect can be detected by, for example, chemical markers or
antigen levels. Therapeutic effects also include reduction in physical symptoms, such as decreased
body temperature. The precise effective amount for a subject will depend upon the subject's size
and health, the nature and extent of the condition, and the therapeutics or combination of
therapeutics selected for administration. Thus, it is not useful to specify an exact effective amount
in advance. However, the effective amount for a given situation can be determined by routine
experimentation and is within the judgement of the clinician.

For purposes of the present invention, an effective dose will be from about 0.01 mg/kg to 50
mg/kg or 0.05 mg/kg to about 10 mg/kg of the DNA constructs in the individual to which it is
administered.
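
The per-kilogram ranges above translate into total amounts by simple multiplication. The following is a minimal sketch of that arithmetic only; the class name `DoseRange`, the constants, and the 70 kg body weight in the usage note are illustrative assumptions and are not part of the specification.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class DoseRange:
    """A dose range expressed in mg of DNA construct per kg of body weight."""
    low_mg_per_kg: float
    high_mg_per_kg: float

    def total_mg(self, body_weight_kg: float) -> Tuple[float, float]:
        """Return (low, high) total dose in mg for the given body weight."""
        if body_weight_kg <= 0:
            raise ValueError("body weight must be positive")
        return (self.low_mg_per_kg * body_weight_kg,
                self.high_mg_per_kg * body_weight_kg)


# The two ranges stated in the text (names are hypothetical labels).
BROAD_RANGE = DoseRange(0.01, 50.0)    # about 0.01 mg/kg to 50 mg/kg
NARROW_RANGE = DoseRange(0.05, 10.0)   # 0.05 mg/kg to about 10 mg/kg
```

For an assumed 70 kg individual, `BROAD_RANGE.total_mg(70.0)` gives roughly 0.7 mg to 3500 mg, and `NARROW_RANGE.total_mg(70.0)` gives roughly 3.5 mg to 700 mg. The actual amount remains a matter of routine experimentation and clinical judgement, as stated above.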

A pharmaceutical composition can also contain a pharmaceutically acceptable carrier. The term
"pharmaceutically acceptable carrier" refers to a carrier for administration of a therapeutic agent,
such as antibodies or a polypeptide, genes, and other therapeutic agents. The term refers to any
pharmaceutical carrier that does not itself induce the production of antibodies harmful to the
individual receiving the composition, and which may be administered without undue toxicity.
Suitable carriers may be large, slowly metabolized macromolecules such as proteins, 
polysaccharides, polylactic acids, polyglycolic acids, polymeric amino acids, amino acid 
copolymers, and inactive virus particles. Such carriers are well known to those of ordinary skill in 
the art. 

Pharmaceutically acceptable salts can be used therein, for example, mineral acid salts such as
hydrochlorides, hydrobromides, phosphates, sulfates, and the like; and the salts of organic acids
such as acetates, propionates, malonates, benzoates, and the like. A thorough discussion of
pharmaceutically acceptable excipients is available in Remington's Pharmaceutical Sciences
(Mack Pub. Co., N.J. 1991).

Pharmaceutically acceptable carriers in therapeutic compositions may contain liquids such as
water, saline, glycerol and ethanol. Additionally, auxiliary substances, such as wetting or
emulsifying agents, pH buffering substances, and the like, may be present in such vehicles.
Typically, the therapeutic compositions are prepared as injectables, either as liquid solutions or
suspensions; solid forms suitable for solution in, or suspension in, liquid vehicles prior to injection
may also be prepared. Liposomes are included within the definition of a pharmaceutically
acceptable carrier.

Delivery Methods

Once formulated, the compositions of the invention can be administered directly to the subject. The 
subjects to be treated can be animals; in particular, human subjects can be treated. 

Direct delivery of the compositions will generally be accomplished by injection, either
subcutaneously, intraperitoneally, intravenously or intramuscularly or delivered to the interstitial
space of a tissue. The compositions can also be administered into a lesion. Other modes of
administration include oral and pulmonary administration, suppositories, and transdermal or
transcutaneous applications (eg. see WO98/20734), needles, and gene guns or hyposprays. Dosage
treatment may be a single dose schedule or a multiple dose schedule.

Vaccines

Vaccines according to the invention may either be prophylactic (ie. to prevent infection) or 
therapeutic (ie. to treat disease after infection). 

Such vaccines comprise immunising antigen(s), immunogen(s), polypeptide(s), protein(s) or
nucleic acid, usually in combination with "pharmaceutically acceptable carriers," which include
any carrier that does not itself induce the production of antibodies harmful to the individual
receiving the composition. Suitable carriers are typically large, slowly metabolized 
macromolecules such as proteins, polysaccharides, polylactic acids, polyglycolic acids, polymeric 
amino acids, amino acid copolymers, lipid aggregates (such as oil droplets or liposomes), and 
inactive virus particles. Such carriers are well known to those of ordinary skill in the art. 

Additionally, these carriers may function as immunostimulating agents ("adjuvants"). Furthermore,
the antigen or immunogen may be conjugated to a bacterial toxoid, such as a toxoid from 
diphtheria, tetanus, cholera, H. pylori, etc. pathogens. 

Preferred adjuvants to enhance effectiveness of the composition include, but are not limited to: (1)
aluminum salts (alum), such as aluminum hydroxide, aluminum phosphate, aluminum sulfate, etc;
(2) oil-in-water emulsion formulations (with or without other specific immunostimulating agents
such as muramyl peptides (see below) or bacterial cell wall components), such as for example (a)
MF59™ (WO 90/14837; Chapter 10 in Vaccine design: the subunit and adjuvant approach, eds.
Powell & Newman, Plenum Press 1995), containing 5% Squalene, 0.5% Tween 80, and 0.5% Span
85 (optionally containing various amounts of MTP-PE (see below), although not required)
formulated into submicron particles using a microfluidizer such as Model 110Y microfluidizer
(Microfluidics, Newton, MA), (b) SAF, containing 10% Squalane, 0.4% Tween 80, 5%
pluronic-blocked polymer L121, and thr-MDP (see below) either microfluidized into a submicron
emulsion or vortexed to generate a larger particle size emulsion, and (c) Ribi™ adjuvant system (RAS),
(Ribi Immunochem, Hamilton, MT) containing 2% Squalene, 0.2% Tween 80, and one or more
bacterial cell wall components from the group consisting of monophosphorylipid A (MPL),
trehalose dimycolate (TDM), and cell wall skeleton (CWS), preferably MPL + CWS (Detox™);
(3) saponin adjuvants, such as Stimulon™ (Cambridge Bioscience, Worcester, MA) may be used
or particles generated therefrom such as ISCOMs (immunostimulating complexes); (4) Complete
Freund's Adjuvant (CFA) and Incomplete Freund's Adjuvant (IFA); (5) cytokines, such as
interleukins (eg. IL-1, IL-2, IL-4, IL-5, IL-6, IL-7, IL-12, etc.), interferons (eg. gamma interferon),
macrophage colony stimulating factor (M-CSF), tumor necrosis factor (TNF), etc; and (6) other
substances that act as immunostimulating agents to enhance the effectiveness of the composition.
Alum and MF59™ are preferred.
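
The percentage compositions listed for the oil-in-water emulsions can be converted into component amounts for a batch of a given size. The sketch below illustrates that arithmetic under stated assumptions: the text does not say whether the percentages are by volume or by weight, so volume/volume is assumed here, the remainder is treated as aqueous phase, and the function name and dictionary are hypothetical.

```python
def component_volumes(total_ml, percent_by_component):
    """Split a batch volume into component volumes (ml).

    Assumes the percentages are volume/volume (an assumption, not stated
    in the text) and assigns whatever is left to the aqueous phase.
    """
    if total_ml <= 0:
        raise ValueError("total volume must be positive")
    total_percent = sum(percent_by_component.values())
    if total_percent > 100:
        raise ValueError("component percentages exceed 100%")
    volumes = {name: total_ml * pct / 100.0
               for name, pct in percent_by_component.items()}
    volumes["aqueous phase"] = total_ml * (100.0 - total_percent) / 100.0
    return volumes


# Percentages given in the text for the MF59 formulation.
MF59_PERCENTAGES = {"squalene": 5.0, "Tween 80": 0.5, "Span 85": 0.5}
```

Under these assumptions, `component_volumes(10.0, MF59_PERCENTAGES)` yields about 0.5 ml squalene, 0.05 ml each of Tween 80 and Span 85, and about 9.4 ml aqueous phase. The same function can be applied to the SAF or RAS percentages by supplying a different dictionary.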

As mentioned above, muramyl peptides include, but are not limited to,
N-acetyl-muramyl-L-threonyl-D-isoglutamine (thr-MDP), N-acetyl-normuramyl-L-alanyl-D-isoglutamine (nor-MDP),
N-acetylmuramyl-L-alanyl-D-isoglutaminyl-L-alanine-2-(1'-2'-dipalmitoyl-sn-glycero-3-hydroxyphosphoryloxy)-ethylamine
(MTP-PE), etc.

The immunogenic compositions (eg. the immunising antigen/immunogen/polypeptide/protein/
nucleic acid, pharmaceutically acceptable carrier, and adjuvant) typically will contain diluents,
such as water, saline, glycerol, ethanol, etc. Additionally, auxiliary substances, such as wetting or
emulsifying agents, pH buffering substances, and the like, may be present in such vehicles.

Typically, the immunogenic compositions are prepared as injectables, either as liquid solutions or
suspensions; solid forms suitable for solution in, or suspension in, liquid vehicles prior to injection
may also be prepared. The preparation also may be emulsified or encapsulated in liposomes for
enhanced adjuvant effect, as discussed above under pharmaceutically acceptable carriers.

Immunogenic compositions used as vaccines comprise an immunologically effective amount of the
antigenic or immunogenic polypeptides, as well as any other of the above-mentioned components,
as needed. By "immunologically effective amount", it is meant that the administration of that
amount to an individual, either in a single dose or as part of a series, is effective for treatment or
prevention. This amount varies depending upon the health and physical condition of the individual
to be treated, the taxonomic group of individual to be treated (eg. nonhuman primate, primate,
etc.), the capacity of the individual's immune system to synthesize antibodies, the degree of
protection desired, the formulation of the vaccine, the treating doctor's assessment of the medical
situation, and other relevant factors. It is expected that the amount will fall in a relatively broad
range that can be determined through routine trials.

The immunogenic compositions are conventionally administered parenterally, eg. by injection,
either subcutaneously, intramuscularly, or transdermally/transcutaneously (eg. WO98/20734).
Additional formulations suitable for other modes of administration include oral and pulmonary
formulations, suppositories, and transdermal applications. Dosage treatment may be a single dose
schedule or a multiple dose schedule. The vaccine may be administered in conjunction with other
immunoregulatory agents.

As an alternative to protein-based vaccines, DNA vaccination may be employed [eg. Robinson & 
Torres (1997) Seminars in Immunology 9:271-283; Donnelly et al (1997) Annu Rev Immunol 
15:617-648; see later herein]. 

Gene Delivery Vehicles

Gene therapy vehicles for delivery of constructs including a coding sequence of a therapeutic of the 
invention, to be delivered to the mammal for expression in the mammal, can be administered either 
locally or systemically. These constructs can utilize viral or non-viral vector approaches in in vivo 
or ex vivo modality. Expression of such coding sequence can be induced using endogenous 
mammalian or heterologous promoters. Expression of the coding sequence in vivo can be either
constitutive or regulated. 

The invention includes gene delivery vehicles capable of expressing the contemplated nucleic acid
sequences. The gene delivery vehicle is preferably a viral vector and, more preferably, a retroviral,
adenoviral, adeno-associated viral (AAV), herpes viral, or alphavirus vector. The viral vector can
also be an astrovirus, coronavirus, orthomyxovirus, papovavirus, paramyxovirus, parvovirus,
picornavirus, poxvirus, or togavirus viral vector. See generally, Jolly (1994) Cancer Gene Therapy
1:51-64; Kimura (1994) Human Gene Therapy 5:845-852; Connelly (1995) Human Gene Therapy
6:185-193; and Kaplitt (1994) Nature Genetics 6:148-153.

Retroviral vectors are well known in the art and we contemplate that any retroviral gene therapy
vector is employable in the invention, including B, C and D type retroviruses, xenotropic
retroviruses (for example, NZB-X1, NZB-X2 and NZB9-1 (see O'Neill (1985) J. Virol 53:160)),
polytropic retroviruses eg. MCF and MCF-MLV (see Kelly (1983) J. Virol 45:291), spumaviruses
and lentiviruses. See RNA Tumor Viruses, Second Edition, Cold Spring Harbor Laboratory, 1985.

Portions of the retroviral gene therapy vector may be derived from different retroviruses. For 
example, retrovector LTRs may be derived from a Murine Sarcoma Virus, a tRNA binding site 
from a Rous Sarcoma Virus, a packaging signal from a Murine Leukemia Virus, and an origin of 
second strand synthesis from an Avian Leukosis Virus. 

These recombinant retroviral vectors may be used to generate transduction competent retroviral
vector particles by introducing them into appropriate packaging cell lines (see US patent
5,591,624). Retrovirus vectors can be constructed for site-specific integration into host cell DNA
by incorporation of a chimeric integrase enzyme into the retroviral particle (see WO96/37626). It is
preferable that the recombinant viral vector is a replication defective recombinant virus.

Packaging cell lines suitable for use with the above-described retrovirus vectors are well known in
the art, are readily prepared (see WO95/30763 and WO92/05266), and can be used to create
producer cell lines (also termed vector cell lines or "VCLs") for the production of recombinant
vector particles. Preferably, the packaging cell lines are made from human parent cells (eg.
HT1080 cells) or mink parent cell lines, which eliminates inactivation in human serum.

Preferred retroviruses for the construction of retroviral gene therapy vectors include Avian
Leukosis Virus, Bovine Leukemia Virus, Murine Leukemia Virus, Mink-Cell Focus-Inducing
Virus, Murine Sarcoma Virus, Reticuloendotheliosis Virus and Rous Sarcoma Virus. Particularly
preferred Murine Leukemia Viruses include 4070A and 1504A (Hartley and Rowe (1976) J Virol
19:19-25), Abelson (ATCC No. VR-999), Friend (ATCC No. VR-245), Graffi, Gross (ATCC No.
VR-590), Kirsten, Harvey Sarcoma Virus and Rauscher (ATCC No. VR-998) and Moloney Murine
Leukemia Virus (ATCC No. VR-190). Such retroviruses may be obtained from depositories or
collections such as the American Type Culture Collection ("ATCC") in Rockville, Maryland or
isolated from known sources using commonly available techniques.

Exemplary known retroviral gene therapy vectors employable in this invention include those
described in patent applications GB2200651, EP0415731, EP0345242, EP0334301, WO89/02468;
WO89/05349, WO89/09271, WO90/02806, WO90/07936, WO94/03622, WO93/25698,
WO93/25234, WO93/11230, WO93/10218, WO91/02805, WO91/02825, WO95/07994, US
5,219,740, US 4,405,712, US 4,861,719, US 4,980,289, US 4,777,127, US 5,591,624. See also
Vile (1993) Cancer Res 53:3860-3864; Vile (1993) Cancer Res 53:962-967; Ram (1993) Cancer
Res 53:83-88; Takamiya (1992) J Neurosci Res 33:493-503; Baba (1993) J Neurosurg
79:729-735; Mann (1983) Cell 33:153; Cane (1984) Proc Natl Acad Sci 81:6349; and Miller
(1990) Human Gene Therapy 1.

Human adenoviral gene therapy vectors are also known in the art and employable in this invention.
See, for example, Berkner (1988) Biotechniques 6:616 and Rosenfeld (1991) Science 252:431, and
WO93/07283, WO93/06223, and WO93/07282. Exemplary known adenoviral gene therapy vectors
employable in this invention include those described in the above referenced documents and in
WO94/12649, WO93/03769, WO93/19191, WO94/28938, WO95/11984, WO95/00655,
WO95/27071, WO95/29993, WO95/34671, WO96/05320, WO94/08026, WO94/11506,
WO93/06223, WO94/24299, WO95/14102, WO95/24297, WO95/02697, WO94/28152,
WO94/24299, WO95/09241, WO95/25807, WO95/05835, WO94/18922 and WO95/09654.
Alternatively, administration of DNA linked to killed adenovirus as described in Curiel (1992)
Hum. Gene Ther. 3:147-154 may be employed. The gene delivery vehicles of the invention also
include adenovirus associated virus (AAV) vectors. Leading and preferred examples of such
vectors for use in this invention are the AAV-2 based vectors disclosed in Srivastava,
WO93/09239. Most preferred AAV vectors comprise the two AAV inverted terminal repeats in
which the native D-sequences are modified by substitution of nucleotides, such that at least 5
native nucleotides and up to 18 native nucleotides, preferably at least 10 native nucleotides up to
18 native nucleotides, most preferably 10 native nucleotides are retained and the remaining
nucleotides of the D-sequence are deleted or replaced with non-native nucleotides. The native
D-sequences of the AAV inverted terminal repeats are sequences of 20 consecutive nucleotides in
each AAV inverted terminal repeat (ie. there is one sequence at each end) which are not involved
in HP formation. The non-native replacement nucleotide may be any nucleotide other than the
nucleotide found in the native D-sequence in the same position. Other employable exemplary AAV
vectors are pWP-19, pWN-1, both of which are disclosed in Nahreini (1993) Gene 124:257-262.
Another example of such an AAV vector is psub201 (see Samulski (1987) J. Virol 61:3096).
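
The retention ranges for the modified D-sequence can be expressed as a simple position-by-position comparison. The sketch below is illustrative only: it models substitution at the same position (deletions are not modeled), the function names are hypothetical, and the 20-nucleotide strings are arbitrary placeholders, not the actual AAV D-sequence.

```python
def native_positions_retained(native, modified):
    """Count positions where the modified sequence keeps the native nucleotide.

    Assumes a substitution-only comparison of equal-length sequences.
    """
    if len(native) != len(modified):
        raise ValueError("sequences must have equal length for this comparison")
    return sum(1 for a, b in zip(native.upper(), modified.upper()) if a == b)


def classify_retention(retained):
    """Map a retained-nucleotide count onto the ranges given in the text."""
    if retained == 10:
        return "most preferred"   # most preferably 10 native nucleotides
    if 10 <= retained <= 18:
        return "preferred"        # preferably at least 10 and up to 18
    if 5 <= retained <= 18:
        return "within range"     # at least 5 and up to 18
    return "outside range"


# Placeholder 20-mers (NOT the real D-sequence): the first 10 positions are
# kept and the last 10 are each replaced with a different nucleotide.
PLACEHOLDER_NATIVE = "ACGTACGTACGTACGTACGT"
PLACEHOLDER_MODIFIED = "ACGTACGTACTGCATGCATG"
```

With these placeholders, `native_positions_retained(PLACEHOLDER_NATIVE, PLACEHOLDER_MODIFIED)` returns 10, which `classify_retention` labels "most preferred".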

Another exemplary AAV vector is the Double-D ITR vector. Construction of the Double-D ITR
vector is disclosed in US Patent 5,478,745. Still other vectors are those disclosed in Carter US
Patent 4,797,368 and Muzyczka US Patent 5,139,941, Chatterjee US Patent 5,474,935, and Kotin
WO94/288157. Yet a further example of an AAV vector employable in this invention is
SSV9AFABTKneo, which contains the AFP enhancer and albumin promoter and directs
expression predominantly in the liver. Its structure and construction are disclosed in Su (1996)
Human Gene Therapy 7:463-470. Additional AAV gene therapy vectors are described in US
5,354,678, US 5,173,414, US 5,139,941, and US 5,252,479.

The gene therapy vectors of the invention also include herpes vectors. Leading and preferred
examples are herpes simplex virus vectors containing a sequence encoding a thymidine kinase
polypeptide such as those disclosed in US 5,288,641 and EP0176170 (Roizman). Additional
exemplary herpes simplex virus vectors include HFEM/ICP6-LacZ disclosed in WO95/04139
(Wistar Institute), pHSVlac described in Geller (1988) Science 241:1667-1669 and in WO90/09441
and WO92/07945, HSV Us3::pgC-lacZ described in Fink (1992) Human Gene Therapy 3:11-19
and HSV 7134, 2 RH 105 and GAL4 described in EP 0453242 (Breakefield), and those deposited
with the ATCC as accession numbers ATCC VR-977 and ATCC VR-260.

Also contemplated are alpha virus gene therapy vectors that can be employed in this invention.
Preferred alpha virus vectors are Sindbis virus vectors. Togaviruses, Semliki Forest virus (ATCC
VR-67; ATCC VR-1247), Middelburg virus (ATCC VR-370), Ross River virus (ATCC VR-373;
ATCC VR-1246), Venezuelan equine encephalitis virus (ATCC VR-923; ATCC VR-1250; ATCC
VR-1249; ATCC VR-532), and those described in US patents 5,091,309, 5,217,879, and
WO92/10578 can also be used. More particularly, those alpha virus vectors described in US Serial No. 08/405,627,
filed March 15, 1995, WO94/21792, WO92/10578, WO95/07994, US 5,091,309 and US 5,217,879
are employable. Such alpha viruses may be obtained from depositories or collections such as the
ATCC in Rockville, Maryland or isolated from known sources using commonly available
techniques. Preferably, alphavirus vectors with reduced cytotoxicity are used (see USSN
08/679640).

DNA vector systems such as eukaryotic layered expression systems are also useful for expressing
the nucleic acids of the invention. See WO95/07994 for a detailed description of eukaryotic layered
expression systems. Preferably, the eukaryotic layered expression systems of the invention are
derived from alphavirus vectors and most preferably from Sindbis viral vectors.

Other viral vectors suitable for use in the present invention include those derived from poliovirus,
for example ATCC VR-58 and those described in Evans, Nature 339 (1989) 385 and Sabin (1973)
J. Biol. Standardization 1:115; rhinovirus, for example ATCC VR-1110 and those described in
Arnold (1990) J Cell Biochem L401; pox viruses such as canary pox virus or vaccinia virus, for
example ATCC VR-111 and ATCC VR-2010 and those described in Fisher-Hoch (1989) Proc Natl
Acad Sci 86:317; Flexner (1989) Ann NY Acad Sci 569:86, Flexner (1990) Vaccine 8:17; in US
4,603,112 and US 4,769,330 and WO89/01973; SV40 virus, for example ATCC VR-305 and those
described in Mulligan (1979) Nature 277:108 and Madzak (1992) J Gen Virol 73:1533; influenza
virus, for example ATCC VR-797 and recombinant influenza viruses made employing reverse
genetics techniques as described in US 5,166,057 and in Enami (1990) Proc Natl Acad Sci
87:3802-3805; Enami & Palese (1991) J Virol 65:2711-2713 and Luytjes (1989) Cell 59:110, (see
also McMichael (1983) NEJ Med 309:13, and Yap (1978) Nature 273:238 and Nature (1979)
277:108); human immunodeficiency virus as described in EP-0386882 and in Buchschacher (1992)
J. Virol. 66:2731; measles virus, for example ATCC VR-67 and VR-1247 and those described in
EP-0440219; Aura virus, for example ATCC VR-368; Bebaru virus, for example ATCC VR-600
and ATCC VR-1240; Cabassou virus, for example ATCC VR-922; Chikungunya virus, for
example ATCC VR-64 and ATCC VR-1241; Fort Morgan Virus, for example ATCC VR-924;
Getah virus, for example ATCC VR-369 and ATCC VR-1243; Kyzylagach virus, for example
ATCC VR-927; Mayaro virus, for example ATCC VR-66; Mucambo virus, for example ATCC
VR-580 and ATCC VR-1244; Ndumu virus, for example ATCC VR-371; Pixuna virus, for
example ATCC VR-372 and ATCC VR-1245; Tonate virus, for example ATCC VR-925; Triniti
virus, for example ATCC VR-469; Una virus, for example ATCC VR-374; Whataroa virus, for
example ATCC VR-926; Y-62-33 virus, for example ATCC VR-375; O'Nyong virus, Eastern
encephalitis virus, for example ATCC VR-65 and ATCC VR-1242; Western encephalitis virus, for
example ATCC VR-70, ATCC VR-1251, ATCC VR-622 and ATCC VR-1252; and coronavirus,
for example ATCC VR-740 and those described in Hamre (1966) Proc Soc Exp Biol Med 121:190.

Delivery of the compositions of this invention into cells is not limited to the above mentioned viral
vectors. Other delivery methods and media may be employed such as, for example, nucleic acid
expression vectors, polycationic condensed DNA linked or unlinked to killed adenovirus alone, for
example see US Serial No. 08/366,787, filed December 30, 1994 and Curiel (1992) Hum Gene
Ther 3:147-154, ligand linked DNA, for example see Wu (1989) J Biol Chem 264:16985-16987,
eucaryotic cell delivery vehicles cells, for example see US Serial No. 08/240,030, filed May 9,
1994, and US Serial No. 08/404,796, deposition of photopolymerized hydrogel materials,
hand-held gene transfer particle gun, as described in US Patent 5,149,655, ionizing radiation as
described in US 5,206,152 and in WO92/11033, nucleic charge neutralization or fusion with cell
membranes. Additional approaches are described in Philip (1994) Mol Cell Biol 14:2411-2418 and
in Woffendin (1994) Proc Natl Acad Sci 91:11581-11585.

Particle mediated gene transfer may be employed, for example see US Serial No. 60/023,867.
Briefly, the sequence can be inserted into conventional vectors that contain conventional control
sequences for high level expression, and then incubated with synthetic gene transfer molecules
such as polymeric DNA-binding cations like polylysine, protamine, and albumin, linked to cell
targeting ligands such as asialoorosomucoid, as described in Wu & Wu (1987) J. Biol Chem
262:4429-4432, insulin as described in Hucked (1990) Biochem Pharmacol 40:253-263, galactose
as described in Plank (1992) Bioconjugate Chem 3:533-539, lactose or transferrin.

Naked DNA may also be employed. Exemplary naked DNA introduction methods are described in
WO 90/11092 and US 5,580,859. Uptake efficiency may be improved using biodegradable latex
beads. DNA coated latex beads are efficiently transported into cells after endocytosis initiation by
the beads. The method may be improved further by treatment of the beads to increase
hydrophobicity and thereby facilitate disruption of the endosome and release of the DNA into the
cytoplasm.

Liposomes that can act as gene delivery vehicles are described in US 5,422,120, WO95/13796,
WO94/23697, WO91/14445 and EP-524,968. As described in USSN 60/023,867, on non-viral
delivery, the nucleic acid sequences encoding a polypeptide can be inserted into conventional
vectors that contain conventional control sequences for high level expression, and then be
incubated with synthetic gene transfer molecules such as polymeric DNA-binding cations like
polylysine, protamine, and albumin, linked to cell targeting ligands such as asialoorosomucoid,
insulin, galactose, lactose, or transferrin. Other delivery systems include the use of liposomes to
encapsulate DNA comprising the gene under the control of a variety of tissue-specific or
ubiquitously-active promoters. Further non-viral delivery suitable for use includes mechanical
delivery systems such as the approach described in Woffendin et al (1994) Proc. Natl Acad. Sci.
USA 91(24):11581-11585. Moreover, the coding sequence and the product of expression of such
can be delivered through deposition of photopolymerized hydrogel materials. Other conventional
methods for gene delivery that can be used for delivery of the coding sequence include, for
example, use of hand-held gene transfer particle gun, as described in US 5,149,655; use of ionizing
radiation for activating transferred gene, as described in US 5,206,152 and WO92/11033.

Exemplary liposome and polycationic gene delivery vehicles are those described in US 5,422,120
and 4,762,915; in WO95/13796; WO94/23697; and WO91/14445; in EP-0524968; and in Stryer,
Biochemistry, pages 236-240 (1975) W.H. Freeman, San Francisco; Szoka (1980) Biochem
Biophys Acta 600:1; Bayer (1979) Biochem Biophys Acta 550:464; Rivnay (1987) Meth Enzymol
149:119; Wang (1987) Proc Natl Acad Sci 84:7851; Plant (1989) Anal Biochem 176:420.

A polynucleotide composition can comprise a therapeutically effective amount of a gene therapy
vehicle, as the term is defined above. For purposes of the present invention, an effective dose will
be from about 0.01 mg/kg to 50 mg/kg or 0.05 mg/kg to about 10 mg/kg of the DNA constructs in
the individual to which it is administered.

Delivery Methods

Once formulated, the polynucleotide compositions of the invention can be administered (1) directly
to the subject; (2) delivered ex vivo, to cells derived from the subject; or (3) in vitro for expression
of recombinant proteins. The subjects to be treated can be mammals or birds. Also, human subjects
can be treated.

Direct delivery of the compositions will generally be accomplished by injection, either 
subcutaneously, intraperitoneally, intravenously or intramuscularly or delivered to the interstitial 
space of a tissue. The compositions can also be administered into a lesion. Other modes of 
administration include oral and pulmonary administration, suppositories, and transdermal or 
transcutaneous applications (eg. see WO98/20734), needles, and gene guns or hyposprays. Dosage
treatment may be a single dose schedule or a multiple dose schedule. 

Methods for the ex vivo delivery and reimplantation of transformed cells into a subject are known
in the art and described in eg. WO93/14778. Examples of cells useful in ex vivo applications
include, for example, stem cells, particularly hematopoietic, lymph cells, macrophages, dendritic
cells, or tumor cells.

Generally, delivery of nucleic acids for both ex vivo and in vitro applications can be accomplished 
by the following procedures, for example, dextran-mediated transfection, calcium phosphate 
precipitation, polybrene mediated transfection, protoplast fusion, electroporation, encapsulation of 
the polynucleotide(s) in liposomes, and direct microinjection of the DNA into nuclei, all well 
known in the art.

Polynucleotide and polypeptide pharmaceutical compositions 

In addition to the pharmaceutically acceptable carriers and salts described above, the following
additional agents can be used with polynucleotide and/or polypeptide compositions. 

A. Polypeptides

One example is polypeptides, which include, without limitation: asialoorosomucoid (ASOR);
transferrin; asialoglycoproteins; antibodies; antibody fragments; ferritin; interleukins; interferons;
granulocyte-macrophage colony stimulating factor (GM-CSF); granulocyte colony stimulating
factor (G-CSF); macrophage colony stimulating factor (M-CSF); stem cell factor; and
erythropoietin. Viral antigens, such as envelope proteins, can also be used. Also, proteins from
other invasive organisms can be used, such as the 17 amino acid peptide from the circumsporozoite
protein of Plasmodium falciparum known as RII.

B. Hormones, Vitamins, etc.

Other groups that can be included are, for example: hormones, steroids, androgens, estrogens, 
thyroid hormone, or vitamins, folic acid. 

C. Polyalkylenes, Polysaccharides, etc.

Also, polyalkylene glycol can be included with the desired polynucleotides/polypeptides. In a
preferred embodiment, the polyalkylene glycol is polyethylene glycol. In addition, mono-, di-, or
polysaccharides can be included. In a preferred embodiment of this aspect, the polysaccharide is
dextran or DEAE-dextran. Also, chitosan and poly(lactide-co-glycolide) can be included.

D. Lipids, and Liposomes 

The desired polynucleotide/polypeptide can also be encapsulated in lipids or packaged in 
liposomes prior to delivery to the subject or to cells derived therefrom.

Lipid encapsulation is generally accomplished using liposomes which are able to stably bind or
entrap and retain nucleic acid. The ratio of condensed polynucleotide to lipid preparation can vary
but will generally be around 1:1 (mg DNA:micromoles lipid), or more of lipid. For a review of the
use of liposomes as carriers for delivery of nucleic acids, see Hug and Sleight (1991) Biochim.
Biophys. Acta 1097:1-17; Straubinger (1983) Meth. Enzymol. 101:512-527.

Liposomal preparations for use in the present invention include cationic (positively charged),
anionic (negatively charged) and neutral preparations. Cationic liposomes have been shown to
mediate intracellular delivery of plasmid DNA (Felgner (1987) Proc. Natl. Acad. Sci. USA
84:7413-7416); mRNA (Malone (1989) Proc. Natl. Acad. Sci. USA 86:6077-6081); and purified
transcription factors (Debs (1990) J. Biol. Chem. 265:10189-10192), in functional form.

Cationic liposomes are readily available. For example,
N[1-(2,3-dioleyloxy)propyl]-N,N,N-triethylammonium (DOTMA) liposomes are available under
the trademark Lipofectin, from GIBCO BRL, Grand Island, NY. (See, also, Felgner supra). Other
commercially available liposomes include transfectace (DDAB/DOPE) and DOTAP/DOPE
(Boehringer). Other cationic liposomes can be prepared from readily available materials using
techniques well known in the art. See, eg. Szoka (1978) Proc. Natl. Acad. Sci. USA 75:4194-4198;
WO90/11092 for a description of the synthesis of DOTAP
(1,2-bis(oleoyloxy)-3-(trimethylammonio)propane) liposomes.

Similarly, anionic and neutral liposomes are readily available, such as from Avanti Polar Lipids 
(Birmingham, AL), or can be easily prepared using readily available materials. Such materials 
include phosphatidyl choline, cholesterol, phosphatidyl ethanolamine, dioleoylphosphatidyl choline 
(DOPC), dioleoylphosphatidyl glycerol (DOPG), dioleoylphosphatidyl ethanolamine (DOPE),
among others. These materials can also be mixed with the DOTMA and DOTAP starting materials 
in appropriate ratios. Methods for making liposomes using these materials are well known in the 
art. 

The liposomes can comprise multilamellar vesicles (MLVs), small unilamellar vesicles (SUVs),
or large unilamellar vesicles (LUVs). The various liposome-nucleic acid complexes are prepared
using methods known in the art. See eg. Straubinger (1983) Meth. Enzymol. 101:512-527; Szoka
(1978) Proc. Natl. Acad. Sci. USA 75:4194-4198; Papahadjopoulos (1975) Biochim. Biophys. Acta
394:483; Wilson (1979) Cell 17:77; Deamer & Bangham (1976) Biochim. Biophys. Acta 443:629;
Ostro (1977) Biochem. Biophys. Res. Commun. 76:836; Fraley (1979) Proc. Natl. Acad. Sci. USA
76:3348; Enoch & Strittmatter (1979) Proc. Natl. Acad. Sci. USA 76:145; Fraley (1980) J. Biol.
Chem. 255:10431; Szoka & Papahadjopoulos (1978) Proc. Natl. Acad. Sci. USA 75:145;
and Schaefer-Ridder (1982) Science 215:166.

E. Lipoproteins

In addition, lipoproteins can be included with the polynucleotide/polypeptide to be delivered.
Examples of lipoproteins to be utilized include: chylomicrons, HDL, IDL, LDL, and VLDL.
Mutants, fragments, or fusions of these proteins can also be used. Also, modifications of naturally
occurring lipoproteins can be used, such as acetylated LDL. These lipoproteins can target the
delivery of polynucleotides to cells expressing lipoprotein receptors. Preferably, if lipoproteins are
included with the polynucleotide to be delivered, no other targeting ligand is included in the
composition.

Naturally occurring lipoproteins comprise a lipid and a protein portion. The protein portions are
known as apoproteins. At present, apoproteins A, B, C, D, and E have been isolated and
identified. At least two of these contain several proteins, designated by Roman numerals: AI, AII,
AIV; CI, CII, CIII.

A lipoprotein can comprise more than one apoprotein. For example, naturally occurring
chylomicrons comprise A, B, C, and E apoproteins; over time these lipoproteins lose A and acquire
C and E apoproteins. VLDL comprises A, B, C, and E apoproteins; LDL comprises apoprotein B;
and HDL comprises apoproteins A, C, and E.

The amino acid sequences of these apoproteins are known and are described in, for example,
Breslow (1985) Annu. Rev. Biochem. 54:699; Law (1986) Adv. Exp. Med. Biol. 151:162; Chen
(1986) J. Biol. Chem. 261:12918; Kane (1980) Proc. Natl. Acad. Sci. USA 77:2465; and Utermann
(1984) Hum. Genet. 65:232.

Lipoproteins contain a variety of lipids including triglycerides, cholesterol (free and esters), and
phospholipids. The composition of the lipids varies in naturally occurring lipoproteins. For example,
chylomicrons comprise mainly triglycerides. A more detailed description of the lipid content of
naturally occurring lipoproteins can be found, for example, in Meth. Enzymol. 128 (1986). The
composition of the lipids is chosen to aid in conformation of the apoprotein for receptor binding
activity. The composition of lipids can also be chosen to facilitate hydrophobic interaction and
association with the polynucleotide binding molecule.

Naturally occurring lipoproteins can be isolated from serum by ultracentrifugation, for instance.
Such methods are described in Meth. Enzymol. (supra); Pitas (1980) J. Biochem. 255:5454-5460
and Mahley (1979) J. Clin. Invest. 64:743-750. Lipoproteins can also be produced by in vitro or
recombinant methods by expression of the apoprotein genes in a desired host cell. See, for
example, Atkinson (1986) Annu. Rev. Biophys. Chem. 15:403 and Radding (1958) Biochim. Biophys.
Acta 30:443. Lipoproteins can also be purchased from commercial suppliers, such as Biomedical
Technologies, Inc., Stoughton, Massachusetts, USA. Further description of lipoproteins can be
found in Zuckermann et al PCT/US97/14465.

F. Polycationic Agents

Polycationic agents can be included, with or without lipoprotein, in a composition with the desired 
polynucleotide/polypeptide to be delivered. 

Polycationic agents typically exhibit a net positive charge at physiologically relevant pH and are
capable of neutralizing the electrical charge of nucleic acids to facilitate delivery to a desired
location. These agents have in vitro, ex vivo, and in vivo applications. Polycationic agents can
be used to deliver nucleic acids to a living subject either intramuscularly, subcutaneously, etc.

The following are examples of useful polypeptides as polycationic agents: polylysine,
polyarginine, polyornithine, and protamine. Other examples include histones, protamines, human
serum albumin, DNA binding proteins, non-histone chromosomal proteins, and coat proteins from
DNA viruses, such as ΦX174. Transcriptional factors also contain domains that bind DNA and
therefore may be useful as nucleic acid condensing agents. Briefly, transcriptional factors such as
C/CEBP, c-jun, c-fos, AP-1, AP-2, AP-3, CPF, ProM, Sp-1, Oct-1, Oct-2, CREP, and TFIID
contain basic domains that bind DNA sequences.

Organic polycationic agents include: spermine, spermidine, and putrescine.

The dimensions and the physical properties of a polycationic agent can be extrapolated from the
list above, to construct other polypeptide polycationic agents or to produce synthetic polycationic
agents.

Synthetic polycationic agents which are useful include, for example, DEAE-dextran and polybrene.
Lipofectin™ and lipofectAMINE™ are monomers that form polycationic complexes when
combined with polynucleotides/polypeptides.

Immunodiagnostic Assays

Neisserial antigens of the invention can be used in immunoassays to detect antibody levels (or,
conversely, anti-Neisserial antibodies can be used to detect antigen levels). Immunoassays based
on well defined, recombinant antigens can be developed to replace invasive diagnostic methods.
Antibodies to Neisserial proteins within biological samples, including for example, blood or serum
samples, can be detected. Design of the immunoassays is subject to a great deal of variation, and a
variety of these are known in the art. Protocols for the immunoassay may be based, for example,
upon competition, or direct reaction, or sandwich type assays. Protocols may also, for example, use
solid supports, or may be by immunoprecipitation. Most assays involve the use of labeled antibody
or polypeptide; the labels may be, for example, fluorescent, chemiluminescent, radioactive, or dye
molecules. Assays which amplify the signals from the probe are also known; examples are
assays which utilize biotin and avidin, and enzyme-labeled and mediated immunoassays, such
as ELISA assays.

Kits suitable for immunodiagnosis and containing the appropriate labeled reagents are constructed
by packaging the appropriate materials, including the compositions of the invention, in suitable
containers, along with the remaining reagents and materials (for example, suitable buffers, salt
solutions, etc.) required for the conduct of the assay, as well as a suitable set of assay instructions.

Nucleic Acid Hybridisation 

"Hybridization" refers to the association of two nucleic acid sequences to one another by hydrogen 
bonding. Typically, one sequence will be fixed to a solid support and the other will be free in
solution. Then, the two sequences will be placed in contact with one another under conditions that
favor hydrogen bonding. Factors that affect this bonding include: the type and volume of solvent;
reaction temperature; time of hybridization; agitation; agents to block the non-specific attachment
of the liquid phase sequence to the solid support (Denhardt's reagent or BLOTTO); concentration
of the sequences; use of compounds to increase the rate of association of sequences (dextran sulfate
or polyethylene glycol); and the stringency of the washing conditions following hybridization. See
Sambrook et al [supra] Volume 2, chapter 9, pages 9.47 to 9.57.

"Stringency" refers to conditions in a hybridization reaction that favor association of very similar
sequences over sequences that differ. For example, the combination of temperature and salt
concentration should be chosen that is approximately 12° to 20°C below the calculated Tm of
the hybrid under study. The temperature and salt conditions can often be determined empirically in
preliminary experiments in which samples of genomic DNA immobilized on filters are hybridized
to the sequence of interest and then washed under conditions of different stringencies. See
Sambrook et al at page 9.50.

Variables to consider when performing, for example, a Southern blot are (1) the complexity of the
DNA being blotted and (2) the homology between the probe and the sequences being detected. The
total amount of the fragment(s) to be studied can vary by a magnitude of 10, from 0.1 to 1 µg for a
plasmid or phage digest to 10⁻⁹ to 10⁻⁸ g for a single copy gene in a highly complex eukaryotic
genome. For lower complexity polynucleotides, substantially shorter blotting, hybridization, and
exposure times, a smaller amount of starting polynucleotides, and lower specific activity of probes
can be used. For example, a single-copy yeast gene can be detected with an exposure time of only 1
hour starting with 1 µg of yeast DNA, blotting for two hours, and hybridizing for 4-8 hours with a
probe of 10⁸ cpm/µg. For a single-copy mammalian gene a conservative approach would start with
10 µg of DNA, blot overnight, and hybridize overnight in the presence of 10% dextran sulfate
using a probe of greater than 10⁸ cpm/µg, resulting in an exposure time of ~24 hours.

Several factors can affect the melting temperature (Tm) of a DNA-DNA hybrid between the probe
and the fragment of interest, and consequently, the appropriate conditions for hybridization and
washing. In many cases the probe is not 100% homologous to the fragment. Other commonly
encountered variables include the length and total G+C content of the hybridizing sequences and
the ionic strength and formamide content of the hybridization buffer. The effects of all of these
factors can be approximated by a single equation:

Tm = 81 + 16.6(log₁₀ Ci) + 0.4[%(G + C)] - 0.6(%formamide) - 600/n - 1.5(%mismatch)

where Ci is the salt concentration (monovalent ions) and n is the length of the hybrid in base pairs
(slightly modified from Meinkoth & Wahl (1984) Anal. Biochem. 138:267-284).
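
By way of illustration only, the equation above can be evaluated numerically as in the following Python sketch. The function name, the argument names, and the example input values are assumptions introduced here for illustration; they do not appear in the cited reference. The salt concentration is taken to be in mol/l.

```python
import math


def hybrid_tm(salt_molar, gc_percent, formamide_percent, length_bp,
              mismatch_percent=0.0):
    """Approximate Tm (deg C) of a DNA-DNA hybrid from the equation above.

    salt_molar        -- monovalent salt concentration Ci (assumed mol/l)
    gc_percent        -- %(G + C) of the hybridizing sequence
    formamide_percent -- % formamide in the hybridization buffer
    length_bp         -- hybrid length n in base pairs
    mismatch_percent  -- % mismatch between probe and fragment
    """
    if salt_molar <= 0 or length_bp <= 0:
        raise ValueError("salt concentration and hybrid length must be positive")
    return (81.0
            + 16.6 * math.log10(salt_molar)
            + 0.4 * gc_percent
            - 0.6 * formamide_percent
            - 600.0 / length_bp
            - 1.5 * mismatch_percent)


if __name__ == "__main__":
    # Hypothetical inputs: 1 M salt, 50% G+C, 50% formamide, 600 bp, 5% mismatch.
    print(round(hybrid_tm(1.0, 50.0, 50.0, 600, 5.0), 1))
```

Each term maps one-to-one onto the equation, so the effect of lowering the formamide content or tolerating more mismatch can be read off by varying a single argument.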

In designing a hybridization experiment, some factors affecting nucleic acid hybridization can be
conveniently altered. The temperature of the hybridization and washes and the salt concentration
during the washes are the simplest to adjust. As the temperature of the hybridization increases (ie.
stringency), it becomes less likely for hybridization to occur between strands that are
nonhomologous, and as a result, background decreases. If the radiolabeled probe is not completely
homologous with the immobilized fragment (as is frequently the case in gene family and
interspecies hybridization experiments), the hybridization temperature must be reduced, and
background will increase. The temperature of the washes affects the intensity of the hybridizing
band and the degree of background in a similar manner. The stringency of the washes is also
increased with decreasing salt concentrations.

In general, convenient hybridization temperatures in the presence of 50% formamide are 42°C for a
probe which is 95% to 100% homologous to the target fragment, 37°C for 90% to 95% homology,
and 32°C for 85% to 90% homology. For lower homologies, formamide content should be lowered
and temperature adjusted accordingly, using the equation above. If the homology between the
probe and the target fragment is not known, the simplest approach is to start with both
hybridization and wash conditions which are nonstringent. If non-specific bands or high
background are observed after autoradiography, the filter can be washed at high stringency and
reexposed. If the time required for exposure makes this approach impractical, several hybridization
and/or washing stringencies should be tested in parallel.
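
The temperature guidance above for 50% formamide can be expressed as a simple lookup, sketched below in Python. The function name is hypothetical. Because the stated ranges share their end points (95% and 90%), the sketch assumes that a boundary value falls into the higher-temperature range; the text does not specify this.

```python
def hybridization_temp_c(homology_percent):
    """Suggested hybridization temperature (deg C) in 50% formamide.

    Returns None below 85% homology, where the text advises lowering the
    formamide content and recomputing from the Tm equation instead.
    Boundary values (95, 90) are assigned to the higher range (assumption).
    """
    if not 0 <= homology_percent <= 100:
        raise ValueError("homology must be between 0 and 100 percent")
    if homology_percent >= 95:
        return 42
    if homology_percent >= 90:
        return 37
    if homology_percent >= 85:
        return 32
    return None


if __name__ == "__main__":
    for h in (98, 92, 87, 70):
        print(h, hybridization_temp_c(h))
```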

Nucleic Acid Probe Assays

Methods such as PCR, branched DNA probe assays, or blotting techniques utilizing nucleic acid
probes according to the invention can determine the presence of cDNA or mRNA. A probe is said 
to "hybridize" with a sequence of the invention if it can form a duplex or double stranded complex, 
which is stable enough to be detected. 

The nucleic acid probes will hybridize to the Neisserial nucleotide sequences of the invention
(including both sense and antisense strands). Though many different nucleotide sequences will
encode the amino acid sequence, the native Neisserial sequence is preferred because it is the actual
sequence present in cells. mRNA represents a coding sequence and so a probe should be
complementary to the coding sequence; single-stranded cDNA is complementary to mRNA, and so
a cDNA probe should be complementary to the non-coding sequence.
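
The strand relationships described above can be illustrated with a short Python sketch. The function names and the nine-nucleotide example sequence are hypothetical and are not taken from any sequence of the invention. The sketch assumes DNA probes written 5' to 3' and uses the coding (sense) strand as the reference.

```python
# Translation table for complementing DNA bases.
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(dna):
    """Return the reverse complement of a DNA sequence (5'->3')."""
    return dna.upper().translate(_COMPLEMENT)[::-1]


def probe_for_mrna(coding_strand):
    """DNA probe that pairs with mRNA: complementary to the coding sequence."""
    return reverse_complement(coding_strand)


def probe_for_first_strand_cdna(coding_strand):
    """DNA probe that pairs with single-stranded cDNA.

    First-strand cDNA is complementary to the mRNA (it is the non-coding
    sequence), so a probe complementary to it has the coding-strand sequence.
    """
    return coding_strand.upper()


if __name__ == "__main__":
    coding = "ATGGCCAAG"  # hypothetical example sequence
    print(probe_for_mrna(coding))
    print(probe_for_first_strand_cdna(coding))
```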

The probe sequence need not be identical to the Neisserial sequence (or its complement); some
variation in the sequence and length can lead to increased assay sensitivity if the nucleic acid probe
can form a duplex with target nucleotides, which can be detected. Also, the nucleic acid probe can
include additional nucleotides to stabilize the formed duplex. Additional Neisserial sequence may
also be helpful as a label to detect the formed duplex. For example, a non-complementary
nucleotide sequence may be attached to the 5' end of the probe, with the remainder of the probe
sequence being complementary to a Neisserial sequence. Alternatively, non-complementary bases
or longer sequences can be interspersed into the probe, provided that the probe sequence has
sufficient complementarity with a Neisserial sequence in order to hybridize therewith and
thereby form a duplex which can be detected.

The exact length and sequence of the probe will depend on the hybridization conditions, such as 
temperature, salt condition and the like. For example, for diagnostic applications, depending on the 
complexity of the analyte sequence, the nucleic acid probe typically contains at least 10-20
nucleotides, preferably 15-25, and more preferably at least 30 nucleotides, although it may be 
shorter than this. Short primers generally require cooler temperatures to form sufficiently stable 
hybrid complexes with the template. 

Probes may be produced by synthetic procedures, such as the triester method of Matteucci et al [J. 
Am. Chem. Soc. (1981) 103:3185], or according to Urdea et al [Proc. Natl. Acad. Sci. USA (1983)
80: 7461], or using commercially available automated oligonucleotide synthesizers. 

The chemical nature of the probe can be selected according to preference. For certain applications,
DNA or RNA are appropriate. For other applications, modifications may be incorporated eg.
backbone modifications, such as phosphorothioates or methylphosphonates, can be used to increase
in vivo half-life, alter RNA affinity, increase nuclease resistance etc. [eg. see Agrawal & Iyer
(1995) Curr Opin Biotechnol 6:12-19; Agrawal (1996) TIBTECH 14:376-387]; analogues such as
peptide nucleic acids may also be used [eg. see Corey (1997) TIBTECH 15:224-229; Buchardt et
al (1993) TIBTECH 11:384-386].

Alternatively, the polymerase chain reaction (PCR) is another well-known means for detecting
small amounts of target nucleic acids. The assay is described in: Mullis et al [Meth. Enzymol.
(1987) 155:335-350]; US patents 4,683,195 and 4,683,202. Two "primer" nucleotides hybridize
with the target nucleic acids and are used to prime the reaction. The primers can comprise sequence
that does not hybridize to the sequence of the amplification target (or its complement) to aid with
duplex stability or, for example, to incorporate a convenient restriction site. Typically, such
sequence will flank the desired Neisserial sequence.

A thermostable polymerase creates copies of target nucleic acids from the primers using the 
original target nucleic acids as a template. After a threshold amount of target nucleic acids are 
generated by the polymerase, they can be detected by more traditional methods, such as Southern
blots. When using the Southern blot method, the labelled probe will hybridize to the Neisserial 
sequence (or its complement). 

Also, mRNA or cDNA can be detected by traditional blotting techniques described in Sambrook et
al [supra]. mRNA, or cDNA generated from mRNA using a polymerase enzyme, can be purified
and separated using gel electrophoresis. The nucleic acids on the gel are then blotted onto a solid
support, such as nitrocellulose. The solid support is exposed to a labelled probe and then washed to
remove any unhybridized probe. Next, the duplexes containing the labelled probe are detected.
Typically, the probe is labelled with a radioactive moiety.

BRIEF DESCRIPTION OF THE DRAWINGS 

Figures 1-20 show biochemical data obtained in the Examples, and also sequence analysis, for
ORFs 37 (Fig. 1A-1E), 5 (Fig. 2A-2B), 2 (Fig. 3A-3D), 15 (Fig. 4A-4C), 22 (Fig. 5A-5C), 28 (Fig.
6A-6B), 32 (Fig. 7A-7B), 4 (Fig. 8A-8F), 61 (Fig. 9), 76 (Fig. 10A-10C), 89 (Fig. 11), 97 (Fig.
12A-12E), 106 (Fig. 13A-13C), 138 (Fig. 14A-B), 23 (Fig. 15A-15C), 25 (Fig. 16A-16E), 27 (Fig.
17A-17B), 79 (Fig. 18A-18B), 85 (Fig. 19A-19D) and 132 (Fig. 20A-20C). M1 and M2 are
molecular weight markers. Arrows indicate the position of the main recombinant product or, in
Western blots, the position of the main N. meningitidis immunoreactive band. TP indicates
N. meningitidis total protein extract; OMV indicates N. meningitidis outer membrane vesicle
preparation. In bactericidal assay results: a diamond (♦) shows preimmune data; a triangle (▲)
shows GST control data; a circle (●) shows data with recombinant N. meningitidis protein.
Computer analyses show a hydrophilicity plot (upper), an antigenic index plot (middle), and an
AMPHI analysis (lower). The AMPHI program has been used to predict T-cell epitopes [Gao et al
(1989) J. Immunol. 143:3007; Roberts et al (1996) AIDS Res Hum Retrovir 12:593; Quakyi et al
(1992) Scand J Immunol suppl. 11:9] and is available in the Protean package of DNASTAR, Inc.
(1228 South Park Street, Madison, Wisconsin 53715 USA).

Figure 21 shows an alignment comparison of amino acid sequences for ORF 4 for several strains 
of Neisseria. Dark shading indicates regions of homology, and gray shading indicates the 
conservation of amino acids with similar characteristics. The Figure demonstrates a high degree of 
conservation among the various strains, further confirming its utility as an antigen for both
vaccines and diagnostics. 

EXAMPLES 

The examples describe nucleic acid sequences which have been identified in N. meningitidis, along 
with their putative translation products, and also those of N. gonorrhoeae. Not all of the nucleic 
acid sequences are complete ie. they encode less than the full-length wild-type protein.

The examples are generally in the following format: 

• a nucleotide sequence which has been identified in N. meningitidis (strain B)

• the putative translation product of this sequence 

• a computer analysis of the translation product based on database comparisons 

• corresponding gene and protein sequences identified in N. meningitidis (strain A) and in 
N. gonorrhoeae 

• a description of the characteristics of the proteins which indicates that they might be 
suitably antigenic 

• results of biochemical analysis (expression, purification, ELISA, FACS etc.)

The examples typically include details of sequence identity between species and strains. Proteins
that are similar in sequence are generally similar in both structure and function, and the sequence
identity often indicates a common evolutionary origin. Comparison with sequences of proteins of
known function is widely used as a guide for the assignment of putative protein function to a new
sequence and has proved particularly useful in whole-genome analyses.

Sequence comparisons were performed at NCBI (http://www.ncbi.nlm.nih.gov) using the
algorithms BLAST, BLAST2, BLASTn, BLASTp, tBLASTn, BLASTx, & tBLASTx [eg. see also
Altschul et al (1997) Gapped BLAST and PSI-BLAST: a new generation of protein database
search programs. Nucleic Acids Research 25:3389-3402]. Searches were performed against the
following databases: non-redundant GenBank+EMBL+DDBJ+PDB sequences and non-redundant
GenBank CDS translations+PDB+SwissProt+SPupdate+PIR sequences.

To compare Meningococcal and Gonococcal sequences, the tBLASTx algorithm was used, as 
implemented at http://www.genome.ou.edu/gono_blast.html. The FASTA algorithm was also used
to compare the ORFs (from GCG Wisconsin Package, version 9.0). 

Dots within nucleotide sequences (eg. position 495 in SEQ ID NO: 11) represent nucleotides which
have been arbitrarily introduced in order to maintain a reading frame. In the same way,
double-underlined nucleotides were removed. Lower case letters (eg. position 496 in SEQ ID NO: 11)
represent ambiguities which arose during alignment of independent sequencing reactions (some of
the nucleotide sequences in the examples are derived from combining the results of two or more
experiments).

Nucleotide sequences were scanned in all six reading frames to predict the presence of 
hydrophobic domains using an algorithm based on the statistical studies of Esposti et al [Critical 
evaluation of the hydropathy of membrane proteins (1990) Eur J Biochem 190:207-219]. These
domains represent potential transmembrane regions or hydrophobic leader sequences. 

Open reading frames were predicted from fragmented nucleotide sequences using the program
ORF FINDER (NCBI).
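
The general idea of scanning a nucleotide sequence for open reading frames in all six reading frames can be sketched in Python as follows. This is an illustrative simplification only and is not the algorithm used by the NCBI program: it assumes that an ORF begins at an ATG codon and ends at the first in-frame stop codon, ignores ORFs that run off the end of a fragment, and reports coordinates on the strand being scanned. All names and the example sequence are hypothetical.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence."""
    return seq.translate(_COMPLEMENT)[::-1]


def find_orfs(seq, min_codons=3):
    """Return a list of (strand, frame, start, end, orf_sequence) tuples.

    start/end are 0-based, end-exclusive positions on the scanned strand;
    the stop codon is included in the reported ORF sequence.
    """
    seq = seq.upper()
    results = []
    for strand, s in (("+", seq), ("-", reverse_complement(seq))):
        for frame in range(3):
            start = None
            for i in range(frame, len(s) - 2, 3):
                codon = s[i:i + 3]
                if start is None and codon == "ATG":
                    start = i  # first in-frame start codon opens an ORF
                elif start is not None and codon in STOP_CODONS:
                    if (i + 3 - start) // 3 >= min_codons:
                        results.append((strand, frame, start, i + 3,
                                        s[start:i + 3]))
                    start = None  # resume scanning after the stop codon
    return results


if __name__ == "__main__":
    for orf in find_orfs("CCATGAAATAGCC"):  # hypothetical fragment
        print(orf)
```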

Underlined amino acid sequences indicate possible transmembrane domains or leader sequences in
the ORFs, as predicted by the PSORT algorithm (http://www.psort.nibb.ac.jp). Functional domains
were also predicted using the MOTIFS program (GCG Wisconsin & PROSITE).

Various tests can be used to assess the in vivo immunogenicity of the proteins identified in the
examples. For example, the proteins can be expressed recombinantly and used to screen patient
sera by immunoblot. A positive reaction between the protein and patient serum indicates that the
patient has previously mounted an immune response to the protein in question ie. the protein is an
immunogen. This method can also be used to identify immunodominant proteins.

The recombinant protein can also be conveniently used to prepare antibodies eg. in a mouse. These 
can be used for direct confirmation that a protein is located on the cell-surface. Labelled antibody
(eg. fluorescent labelling for FACS) can be incubated with intact bacteria and the presence of label 
on the bacterial surface confirms the location of the protein. 

In particular, the following methods (A) to (S) were used to express, purify and biochemically
characterise the proteins of the invention:

A) Chromosomal DNA preparation

N. meningitidis strain 2996 was grown to exponential phase in 100ml of GC medium, harvested by
centrifugation, and resuspended in 5ml buffer (20% Sucrose, 50mM Tris-HCl, 50mM EDTA,
pH 8). After 10 minutes incubation on ice, the bacteria were lysed by adding 10ml lysis solution
(50mM NaCl, 1% Na-Sarkosyl, 50µg/ml Proteinase K), and the suspension was incubated at 37°C
for 2 hours. Two phenol extractions (equilibrated to pH 8) and one CHCl3/isoamyl alcohol (24:1)
extraction were performed. DNA was precipitated by addition of 0.3M sodium acetate and 2
volumes ethanol, and was collected by centrifugation. The pellet was washed once with 70%
ethanol and redissolved in 4ml buffer (10mM Tris-HCl, 1mM EDTA, pH 8). The DNA
concentration was measured by reading the OD at 260 nm.
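
The conversion from an OD reading at 260 nm to a DNA concentration is not spelled out above. A minimal Python sketch follows, assuming the commonly used convention that one absorbance unit at 260 nm (1 cm path length) corresponds to approximately 50 µg/ml of double-stranded DNA. That factor, the function names, the dilution factor, and the example reading are assumptions made for illustration; only the 4ml redissolution volume comes from the procedure above.

```python
# Assumed convention: 1 A260 unit ~ 50 ug/ml double-stranded DNA (1 cm path).
DSDNA_UG_PER_ML_PER_A260 = 50.0


def dsdna_concentration_ug_per_ml(a260, dilution_factor=1.0):
    """Estimate dsDNA concentration (ug/ml) of the undiluted sample."""
    if a260 < 0 or dilution_factor <= 0:
        raise ValueError("a260 must be >= 0 and dilution_factor must be > 0")
    return a260 * DSDNA_UG_PER_ML_PER_A260 * dilution_factor


def total_yield_ug(a260, dilution_factor=1.0, volume_ml=4.0):
    """Total DNA (ug) in the preparation; 4 ml is the redissolution volume."""
    return dsdna_concentration_ug_per_ml(a260, dilution_factor) * volume_ml


if __name__ == "__main__":
    # Hypothetical reading: A260 = 0.5 measured on a 1:20 dilution.
    print(dsdna_concentration_ug_per_ml(0.5, 20))
    print(total_yield_ug(0.5, 20))
```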

B) Oligonucleotide design

Synthetic oligonucleotide primers were designed on the basis of the coding sequence of each ORF,
using (a) the meningococcus B sequence when available, or (b) the gonococcus/meningococcus A
sequence, adapted to the codon preference usage of meningococcus as necessary. Any predicted
signal peptides were omitted, by deducing the 5'-end amplification primer sequence immediately
downstream from the predicted leader sequence.

For most ORFs, the 5' primers included two restriction enzyme recognition sites (BamHI-NdeI,
BamHI-NheI, or EcoRI-NheI, depending on the gene's own restriction pattern); the 3' primers
included a XhoI restriction site. This procedure was established in order to direct the cloning of
each amplification product (corresponding to each ORF) into two different expression systems:
pGEX-KG (using either BamHI-XhoI or EcoRI-XhoI), and pET21b+ (using either NdeI-XhoI or
NheI-XhoI).

5'-end primer tail:  CGCGGATCCCATATG (SEQ ID NO: 1099) (BamHI-NdeI)
                     CGCGGATCCGCTAGC (SEQ ID NO: 1100) (BamHI-NheI)
                     CCGGAATTCTAGCTAGC (SEQ ID NO: 1101) (EcoRI-NheI)
3'-end primer tail:  CCCGCTCGAG (SEQ ID NO: 1102) (XhoI)
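
The relationship between the primer tails listed above and the restriction sites they carry can be checked with a short Python sketch. The recognition sequences in the dictionary are the standard ones for these enzymes and are supplied here as background knowledge; they are not stated in the text. The function names are hypothetical, and the hybridizing region used in the example is that of the ORF 1 forward primer in Table I.

```python
# Standard recognition sequences (background knowledge, not from the text).
RESTRICTION_SITES = {
    "BamHI": "GGATCC",
    "NdeI": "CATATG",
    "NheI": "GCTAGC",
    "EcoRI": "GAATTC",
    "XhoI": "CTCGAG",
}

# Primer tails and the sites each is stated to carry.
PRIMER_TAILS = {
    "CGCGGATCCCATATG": ("BamHI", "NdeI"),
    "CGCGGATCCGCTAGC": ("BamHI", "NheI"),
    "CCGGAATTCTAGCTAGC": ("EcoRI", "NheI"),
    "CCCGCTCGAG": ("XhoI",),
}


def sites_in(seq):
    """Names of the listed enzymes whose recognition site occurs in seq."""
    seq = seq.upper()
    return [name for name, site in RESTRICTION_SITES.items() if site in seq]


def build_primer(tail, hybridizing_region):
    """Join a 5' tail to the region that hybridizes to the target."""
    return tail.upper() + hybridizing_region.upper()


if __name__ == "__main__":
    for tail, expected in PRIMER_TAILS.items():
        print(tail, sites_in(tail), "expected:", expected)
    print(build_primer("CGCGGATCCGCTAGC", "GGACACACTTATTTCGG"))
```

A check of this kind is also a convenient way to confirm that the hybridizing region of a given ORF does not itself contain one of the cloning sites, which is the reason the text gives for choosing among the alternative tails.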

For ORFs 5, 15, 17, 19, 20, 22, 27, 28, 65 & 89, two different amplifications were performed to
clone each ORF in the two expression systems. Two different 5' primers were used for each ORF;
the same 3' XhoI primer was used as before:

5'-end primer tail:  GGAATTCCATATGGCCATGG (SEQ ID NO: 1103) (NdeI)
5'-end primer tail:  CGGGATCC (BamHI)

ORF 76 was cloned in the pTRC expression vector and expressed as an amino-terminus His-tag
fusion. In this particular case, the predicted signal peptide was included in the final product.
NheI-BamHI restriction sites were incorporated using primers:

5'-end primer tail:  GATCAGCTAGCCATATG (SEQ ID NO: 1104) (NheI)
3'-end primer tail:  CGGGATCC (BamHI)



As well as containing the restriction enzyme recognition sequences, the primers included
nucleotides which hybridized to the sequence to be amplified. The number of hybridizing
nucleotides depended on the melting temperature of the whole primer, and was determined for each
primer using the formulae:

Tm = 4 (G+C) + 2 (A+T)                  (tail excluded)
Tm = 64.9 + 0.41 (% GC) - 600/N         (whole primer)

The average melting temperatures of the selected oligos were 65-70°C for the whole oligo and
50-55°C for the hybridising region alone.
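
The two primer formulae can be applied as in the following Python sketch. The function names are hypothetical; the example uses the tail and hybridizing region of the ORF 1 forward primer from Table I, and N is taken to be the length of the whole primer in nucleotides.

```python
def tm_hybridizing_region(seq):
    """Tm = 4 (G+C) + 2 (A+T), applied to the hybridizing region only."""
    seq = seq.upper()
    gc = seq.count("G") + seq.count("C")
    at = seq.count("A") + seq.count("T")
    return 4 * gc + 2 * at


def tm_whole_primer(seq):
    """Tm = 64.9 + 0.41 (% GC) - 600/N, applied to tail plus hybridizing region."""
    seq = seq.upper()
    n = len(seq)
    if n == 0:
        raise ValueError("primer sequence must not be empty")
    gc_percent = 100.0 * (seq.count("G") + seq.count("C")) / n
    return 64.9 + 0.41 * gc_percent - 600.0 / n


if __name__ == "__main__":
    tail = "CGCGGATCCGCTAGC"              # BamHI-NheI tail
    hybridizing = "GGACACACTTATTTCGG"     # ORF 1 forward, hybridizing region
    print(tm_hybridizing_region(hybridizing))
    print(round(tm_whole_primer(tail + hybridizing), 1))
```

In practice the hybridizing region would be lengthened or shortened one nucleotide at a time until both values fall in the desired ranges.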



Table I shows the forward and reverse primers used for each amplification. In certain cases, it will
be noted that the sequence of the primer does not exactly match the sequence in the ORF. When
initial amplifications were performed, the complete 5' and/or 3' sequence was not known for some
meningococcal ORFs, although the corresponding sequences had been identified in gonococcus.
For amplification, the gonococcal sequences could thus be used as the basis for primer design,
altered to take account of codon preference. In particular, the following codons were changed:
ATA->ATT; TCG->TCT; CAG->CAA; AAG->AAA; GAG->GAA; CGA->CGC; CGG->CGC;
GGG->GGC. Italicised nucleotides in Table I indicate such a change. It will be appreciated that,
once the complete sequence has been identified, this approach is generally no longer necessary.
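
The codon changes listed above can be applied to an in-frame coding sequence as in the following Python sketch. The function name and the example sequence are hypothetical. The sketch assumes the input starts at the first base of a codon and leaves any trailing partial codon unchanged.

```python
# Codon substitutions listed in the text (each preserves the encoded amino acid).
CODON_CHANGES = {
    "ATA": "ATT",
    "TCG": "TCT",
    "CAG": "CAA",
    "AAG": "AAA",
    "GAG": "GAA",
    "CGA": "CGC",
    "CGG": "CGC",
    "GGG": "GGC",
}


def adapt_codons(coding_seq):
    """Apply the listed codon changes to an in-frame coding sequence."""
    seq = coding_seq.upper()
    codons = [seq[i:i + 3] for i in range(0, len(seq), 3)]
    return "".join(CODON_CHANGES.get(codon, codon) for codon in codons)


if __name__ == "__main__":
    print(adapt_codons("ATGAAGGGGTCG"))  # hypothetical in-frame sequence
```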



TABLE I - PCR primers 



ORF 


Primer 


Sequence 


Restriction sites 


ORF1 


Forward 
Reverse 


CGCGGATCCGCTAGC - GGACACACTTATTTCGG [<SEQ ID 
924>] (SEQ ID NO:924) 

CCCGCTCGAG- CCAGCGGTAGCCTAATT [<SEQ ID 92 5>] 
(SEQ ID NO: 925) 


BamHI-Nhel 
Xhol 


ORF 2 


Forward 
Reverse 


GCGGATCCCATATG - TTTGATTTCGGTTTGGG [ < SEQ I D 
926>] (SEQ ID NO: 926) 

CCCGCTCGAG - GACGGCATAACGGCG [<SEQ ID 927 >] 
(SEQ ID NO: 927) 


BamHI-Ndel 
Xhol 


ORF 2-1 


Forward 
Reverse 


GCGGATCCCATATG -TTTGATTTCGGTTTGGG [<SEQ ID 
928>] (SEQ ID NO: 928) 

CCCGCTCGAG- TGATTTACGGACGCGCA [<SEQ ID 929>] 
(SEQ ID NO: 929) 


BamHI-Ndel 
Xhol 


ORF 4 


Forward 
Reverse 


GCGGATCCCATATG - TGCGGAGGTCAAAAAGAC [<SEQ ID 
930>] (SEQ ID NO: 930) 

CCCGCTCGAG- TTTGGCTGCGCCTTC [<SEQ ID 931>]
(SEQ ID NO: 931)


BamHI-NdeI
XhoI




ORF5 


Forward 
Forward 


GGAATTCCATATGGCCATGG - TGGAAGGCGCACAACC [ < SEQ 
ID 932>] (SEQ ID NO: 932) 

CGGGATCC - ATGGAAGGCGCACAAC [<SEQ ID 933>] 
(SEQ ID NO: 933) 


Ndel-Ncol 
BamHI 




Reverse 


CCCGCTCGAG - GACTGTGCAAAAACGG [<SEQ ID 934 >] 
(SEQ ID NO: 934) 


Xhol 


ORF6 


Forward 
Reverse 


CGCGGATCCCATATG-ACCCGTCAATCTCTGCA [<SEQ ID 
935>] (SEQ ID NO: 935) 

CCCGCTCGAG -TGCGCCGAACACTTTC [<SEQ ID 936 >] 
(SEQ ID NO: 936) 


BamHI-Ndel 
Xhol 


ORF7 


Forward 
Reverse 


CGCGGATCCGCTAGC - GCGCTGCTTTTTGTTCC [<SEQ ID 
937>] (SEQ ID NO: 937) 

CCCGCTCGAG- TTTCAAAATATATTTGCGGA [<SEQ ID 
938>] (SEQ ID NO: 938) 


BamHI-Nhel 
Xhol 


ORF 8 


Forward 
Reverse 


GCGGATCCCATATG -GCTCAACTGCTTCGTAC [<SEQ ID 
939>] (SEQ ID NO: 939) 

CCCGCTCGAG -AGCAGGCTTTGGCGC [<SEQ ID 940>] 
(SEQ ID NO: 940) 


BamHI-Ndel 
Xhol 


ORF9 


Forward 
Reverse 


CGCGGATCCCATATG- CCGAAGGAAGTCGGAAA [<SEQ ID 
941>] (SEQ ID NO: 941) 

CCCGCTCGAG -TTTCCGAGGTTTTCGGG [<SEQ ID 942 >] 
(SEQ ID NO: 942) 


BamHI-Ndel 
Xhol 


ORF 10 


Forward 
Reverse 


GCGGATCCCATATG - GACACAAAAGAAATCCTC [<SEQ ID 
943>] (SEQ ID NO: 943) 

CCCGCTCGAG- TAATGGGAAACCTTGTTTT [<SEQ ID 
944>] (SEQ ID NO: 944) 


BamHI-Ndel 
Xhol 


ORF11 


Forward 
Reverse 


GCGGATCCCATATG - GCGGTCAACCTCTACG [<SEQ ID 
945>] (SEQ ID NO: 945) 

CCCGCTCGAG -GGAAACGACTTCGCC [<SEQ ID 946 >] 
(SEQ ID NO: 946) 


BamHI-Ndel 
Xhol 


ORF 13 


Forward 
Reverse 


CGCGGATCCCATATG - GCTCTGCTTTCCGCGC [ < SEQ I D 
947>] (SEQ ID NO: 947) 

CCCGCTCGAG -AGGGTGTGTGATAATAAG [<SEQ ID 
948>] (SEQ ID NO: 948) 


BamHI-Ndel 
Xhol 


ORF 15


Forward 
Forward 


GGAATTCCATATGGCCATGG - GCGGGACACTGACAG [<SEQ
ID 949>] (SEQ ID NO: 949) 

CGGGATCC -TGCGGGACACTGACAGG [<SEQ ID 950 >] 
(SEQ ID NO: 950) 


Ndel-Ncol 
BamHI 




Reverse 


CCCGCTCGAG -AGGTTGGCCTTGTCTATG [<SEQ ID 
951>] (SEQ ID NO: 951) 


Xhol 


ORF 17 


Forward 


GGAATTCCATATGGCCATGG - TTGCCGGCCTGTTCG [<SEQ 
ID 952>] (SEQ ID NO: 952) 


Ndel-Ncol 


Forward


CGGGATCC - ATTGCCGGCCTGTTCG [<SEQ ID 953 >] 
(SEQ ID NO: 953) 


BamHI 




Reverse 


CCCGCTCGAG - AAGCAGGTTGTACAGC [<SEQ ID 954 >] 
(SEQ ID NO: 954) 


Xhol 


ORF 18 


Forward 
Reverse 


GCGGATCCCATATG-ATTTTGCTGCATTTGGAT [<SEQ ID 
955>] (SEQ ID NO: 955) 

CCCGCTCGAG -TCTTCCAATTTCTGAAAGC [<SEQ ID 
956>] (SEQ ID NO: 956) 


BamHI-Ndel 
Xhol 


ORF 19 


Forward 
Forward 


GGAATTCCATATGGCCATGG - TCGCCAGTGTTTTTACC 

[<SEQ ID 957>] (SEQ ID NO: 957) 
CGGGATCC -TTCGCCAGTGTTTTTACCG [<SEQ ID 958 >] 

(SEQ ID NO: 958) 


Ndel-Ncol 
BamHI 




Reverse 


CCCGCTCGAG -GGTGTTTTTGAAGCTGCC [<SEQ ID 
959>] (SEQ ID NO: 959) 


Xhol 


ORF 20 


Forward 
Forward 


GGAATTCCATATGGCCATGG - TCGGCGCGGGTATG [ < SEQ 
ID 960>] (SEQ ID NO: 960) 

CGGGATCC -TTCGGCGCGGGTATG [<SEQ ID 961>] 
(SEQ ID NO: 961) 


Ndel-Ncol 
BamHI 




Reverse 


CCCGCTCGAG- CGGCGAGCGAGAGCA [<SEQ ID 962 >] 
(SEQ ID NO: 962) 


Xhol 


ORF 22 


Forward 

Forward 


Reverse 


GGAATTCCATATGGCCATGG - TGATTAAAATCAAAAAAGGTCT 
[<SEQ ID 963>] (SEQ ID NO: 963) 
CGGGATCC -ATGATTAAAATCAAAAAAGGTCTAAACC [ < SEQ 
ID 964>] (SEQ ID NO: 964) 

CCCGCTCGAG -ATTATGATAGCGGCCC [<SEQ ID 965>] 
(SEQ ID NO: 965) 


Ndel-Ncol 

BamHI 

Xhol 


ORF 23 


Forward 
Reverse 


CGCGGATCCCATATG-GATGTTTCTGTTTCAGAC [<SEQ ID 
966>] (SEQ ID NO: 966) 

CCCGCTCGAG- TTTAAACCGATAGGTAAACG [<SEQ ID 
967>] (SEQ ID NO: 967) 


BamHI-Ndel 
Xhol 


ORF 24 


Forward 
Forward 


GGAATTCCATATGGCCATGG - TGATGCCGGAAATGGTG 

[<SEQ ID 968>] (SEQ ID NO: 968) 
CGGGATCC -ATGATGCCGGAAATGGTG [<SEQ ID 969>] 

(SEQ ID NO: 969) 


Ndel-Ncol 
BamHI 




Reverse 


CCCGCTCGAG- TGTCAGCGTGGCGCA [<SEQ ID 970>] 
(SEQ ID NO: 970) 


Xhol 


ORF 25 


Forward 
Reverse 


GCGGATCCCATATG-TATCGCAAACTGATTGC [<SEQ ID 
971>] (SEQ ID NO: 971) 

CCCGCTCGAG -ATCGATGGAATAGCCG [<SEQ ID 972 >] 
(SEQ ID NO: 972) 



BamHI-Ndel 
Xhol 


ORF 26 


Forward 
Reverse 


GCGGATCCCATATG - CAGCTGATCGACTATTC [<SEQ ID 
973>] (SEQ ID NO: 973) 

CCCGCTCGAG -GACATCGGCGCGTTTT [<SEQ ID 974 >] 
(SEQ ID NO: 974) 


BamHI-Ndel 
Xhol 


ORF27


Forward 
Forward 
Reverse 


GGAATTCCATATGGCCATGG - AGACCTATTCTGTTTA [<SEQ 
ID 974>] (SEQ ID NO: 1168) 

CGGGATCC- CAGACCTATTCTGTTTATTTTAATC [<SEQ 
ID 975>] (SEQ ID NO: 975) 

CCCGCTCGAG- GGGTTCGATTAAATAACCAT [<SEQ ID 
976>] (SEQ ID NO: 976) 


Ndel-Ncol 

BamHI 

Xhol 


ORF28 


Forward 
Forward 


GGAATTCCATATGGCCATGG - ACGGCTGTACGTTGATGT 
[<SEQ ID 977>] (SEQ ID NO: 977) 
CGGGATCC -AACGGCTGTACGTTGATG [<SEQ ID 978 >] 
(SEQ ID NO: 978) 


Ndel-Ncol 
BamHI 




Reverse 


CCCGCTCGAG- TTTGTCAGAGGAATTCGCG [<SEQ ID 
979>] (SEQ ID NO: 979) 


Xhol 


ORF 29 


Forward 
Forward 
Reverse 


GCGGATCCCATATG - AACGGTTTGGATGCCCG [<SEQ ID 
980>] (SEQ ID NO: 980) 

CGCGGATCCGCTAGC -AACGGTTTGGATGCCCG [<SEQ ID 
981>] (SEQ ID NO: 981) 

CCCGCTCGAG- TTTGTCTAAGTTCCTGATATG [<SEQ ID 
982>] (SEQ ID NO: 982) 


BamHI-Ndel 
BamHI-Nhel 
Xhol 


ORF32 


Forward 
Reverse 


CGCGGATCCCATATG-AATACTCCTCCTTTTG ( [<SEQ ID 
983>] (SEQ ID NO: 983) 

CCCGCTCGAG - GCGTATTTTTTGATGCTTTG [<SEQ ID 
984>] (SEQ ID NO: 984) 


BamHI-Ndel 
Xhol 


ORF 33 


Forward 
Reverse 


GCGGATCCCATATG - ATTGATAGGGATCGTATG [<SEQ ID- 
985>] (SEQ ID NO: 985) 

CCCGCTCGAG -TTGATCTTTCAAACGGCC [<SEQ ID 
986>] (SEQ ID NO: 986) 


BamHI-Ndel 
Xhol 


ORF 35 


Forward 
Forward 
Reverse 


GCGGATCCCATATG - TTCAGAGCTCAGCTT [<SEQ ID 
987>] (SEQ ID NO: 987) 

CGCGGATCCGCTAGC -TTCAGAGCTCAGCTT [<SEQ ID 
988>] (SEQ ID NO: 988) 

CCCGCTCGAG -AAACAGCCATTTGAGCGA [<SEQ ID 
989>] (SEQ ID NO: 989) 


BamHI-Ndel 
BamHI-Nhel 
Xhol 


ORF 37 


Forward 
Reverse 


GCGGATCCCATATG - GATGACGTATCGGATTTT [<SEQ ID 
990>] (SEQ ID NO: 990) 

CCCGCTCGAG- ATAGCCCGCTTTCAGG [<SEQ ID 991 >] 
(SEQ ID NO: 991) 


BamHI-Ndel 
Xhol 


LPKJr 5o 


Forward
Reverse 


CGCGGATCCGCTAGC - TCCGAACGCGAGTGGAT [<SEQ ID 
992>] (SEQ ID NO: 992) 

CCCGCTCGAG- AGCATTGTCCAAGGGGAC [<SEQ ID 
993>] (SEQ ID NO: 993) 


BamHI-NheI
Xhol 


ORF 65 


Forward 


GGAATTCCATATGGCCATGG - TGCTGTATCTGAATCAAG 
[<SEQ ID 994>] (SEQ ID NO: 994) 


Ndel-Ncol 




Forward 


CGGGATCC -TTGCTGTATCTGAATCAAGG [<SEQ ID
995>] (SEQ ID NO: 995)


BamHI




Reverse


CCCGCTCGAG- CCGCATCGGCAGACA [<SEQ ID 996>]
(SEQ ID NO: 996)


Xhol 


ORF 66 


Forward 
Reverse 


GCGGATCCCATATG - TACGCATTTACCGCCG [<SEQ ID 
997>] (SEQ ID NO: 997) 

CCCGCTCGAG - TGGATTTTGCAGAGATGG [<SEQ ID 
998>] (SEQ ID NO: 998) 


BamHI-Ndel 
Xhol 


ORF 72 


Forward 


Reverse 


CGCGGATCCCATATG - AATGCAGTAAAAATATCTGA [<SEQ 
ID 999>] (SEQ ID NO: 999) 
CCCGCTCGAG - GCCTGAGACCTTTGCAA [ < SEQ I D 
1000>] (SEQ ID NO: 1000) 


BamHI-Ndel 
Xhol 


ORF 71 


Forward 


Reverse 


GCGGATCCCATATG -AGATTTTTCGGTATCGG [<SEQ ID 
1001>] (SEQ ID NO: 1001) 

CCCGCTCGAG - TTCATCTTTTTCATGTTCG [<SEQ ID 
1002>] (SEQ ID NO: 1002) 


BamHI-Ndel 
Xhol 


ORF 75 


Forward 
Reverse 


GCGGATCCCATATG- TCTGTCTTTCAAACGGC [<SEQ ID 
1003>] (SEQ ID NO: 1003) 

CCCGCTCGAG - TTTGTTTTTGCAAGACAG [<SEQ ID 
1004>] (SEQ ID NO: 1004) 


BamHI-Ndel 
Xhol 


ORF 76 


Forward 
Reverse 


GATCAGCTAGCCATATG - AAACAGAAAAAAACCGC [ <SEQ 
ID 1005>] (SEQ ID NO: 1005) 

CGGGATCC - TTACGGTTTGACACCGTT [<SEQ ID 1006>] 
(SEQ ID NO: 1006) 


Nhel-Ndel 
BamHI 


ORF 79 


Forward 
Reverse 


CGCGGATCCCATATG -GTTTCCGCCGCCG [<SEQ ID 
1007>] (SEQ ID NO: 1007) 

CCCGCTCGAG - GTGCTGATGCGCTTCG [<SEQ ID 1008 >] 
(SEQ ID NO: 1008) 


BamHI-Ndel 
Xhol 


ORF 8^ 


Forward 


Reverse 


GCGGATCCCATATG -AAAACCCTGCTGCTGC [<SEQ ID 
1009>] (SEQ ID NO: 1009) 

CCCGCTCGAG -GCCGCCTTTGCGGC [<SEQ ID 1010>] 
(SEQ ID NO: 1010) 


BamHI-Ndel 
Xhol 


ORF 84 


Forward 
Reverse 


GCGGATCCCATATG - GCAGAGATCTGTTTG [<SEQ ID 
1011>] (SEQ ID NO: 1011) 
CCCGCTCGAG -GTTTGCCGATCCGACCA [<SEQ ID 
1012>] (SEQ ID NO: 1012) 


BamHI-Ndel 
Xhol 


ORF 85 


Forward 
Reverse 


CGCGGATCCCATATG- GCGG1 1 L<bnU ID 
1013>] (SEQ ID NO: 1013) 

CCCGCTCGAG-TCGGCGCGGCGGGC [<SEQ ID 1014 >] 
(SEQ ID NO: 1014) 


BamHI-NdeI
Xhol 


ORF 89 


Forward 
Forward 


GGAATTCCATATGGCCATGG - CCATACCTTCTTATCA [<SEQ
ID 1015>] (SEQ ID NO: 1015) 
CGGGATCC -GCCATACCTTCTTATCAGAG [<SEQ ID 
1016>] (SEQ ID NO: 1016) 


Ndel-Ncol 
BamHI 


Reverse


CCCGCTCGAG -TTTTTTGCGATTAGAAAAAGC [<SEQ ID
1017>] (SEQ ID NO: 1017) 


Xhol 


ORF 97 


Forward 


GCGGATCCCATATG - CATCCTGCCAGCGAAC [<SEQ ID 
1018>] (SEQ ID NO: 1018) 


BamHI-Ndel 




Reverse 


CCCGCTCGAG - TTCGCCTACGGTTTTTTG [ < SEQ ID 
1019>] (SEQ ID NO: 1019) 


Xhol 


ORF 98 


Forward 


GCGGATCCCATATG -ACGGTAACTGCGG [<SEQ ID 
1020>] (SEQ ID NO: 1020) 


BamHI-Ndel 




Reverse 


CCCGCTCGAG - TTGTTGTTCGGGCAAATC [<SEQ ID 
1021>] (SEQ ID NO: 1021) 


Xhol 


ORF 100 


Forward 


GCGGATCCCATATG - TCGGGCATTTACACCG [<SEQ ID 
1022>] (SEQ ID NO: 1022) 


BamHI-Ndel 




Reverse 


CCCGCTCGAG -ACGGGTTTCGGCGGAA [<SEQ ID 1023 >] 
(SEQ ID NO: 1023) 


Xhol 


ORF 101 


Forward 


GCGGATCCCATATG -ATTTATCAAAGAAACCTC [<SEQ ID 
1024>] (SEQ ID NO: 1024) 


BamHI-Ndel 




Reverse 


CCCGCTCGAG -TTTTCCGCCTTTCAATGT [<SEQ ID 
1025>] (SEQ ID NO: 1025) 


Xhol 


ORF 102 


Forward 


GCGGATCCCATATG -GCAGGGCTGTTTTACC [<SEQ ID 
1026>] (SEQ ID NO: 1026) 


BamHI-Ndel 




Reverse 


CCCGCTCGAG -AAACGGTTTGAACACGAC [<SEQ ID 
1027>] (SEQ ID NO: 1027) 


Xhol 


ORF 103 


Forward 


GCGGATCCCATATG - AACCACGACATCAC [ < SEQ ID 
1028>] (SEQ ID NO: 1028) 


BamHI-Ndel 




Reverse 


CCCGCTCGAG- CAGCCACAGGACGGC [<SEQ ID 1029>] 
(SEQ ID NO: 1029) 


Xhol 


ORF 104 


Forward 


GCGGATCCCATATG -ACGTGGGGAACGC [<SEQ ID 
1030>] (SEQ ID NO: 1030) 


BamHI-Ndel 




Reverse 


CCCGCTCGAG - GCGGCGTTTGAACGGC [<SEQ ID 1031 >] 
(SEQ ID NO: 1031) 


Xhol 


ORF 105 


Forward 


GCGGATCCCATATG- ACCAAATTTCAAACCCCTC [<SEQ ID 
1032>] (SEQ ID NO: 1032) 


BamHI-Ndel 




Reverse 


CCCGCTCGAG -TAAACGAATGCCGTCCAG [<SEQ ID 
1033>] (SEQ ID NO: 1033) 


Xhol 


ORF 106 


Forward 


GCGGATCCCATATG -AGGATAACCGACGGCG [<SEQ ID 
1034>] (SEQ ID NO: 1034) 


BamHI-Ndel 




Reverse 


CCCGCTCGAG- TTTGTTCCCGATGATGTT [<SEQ ID 
1035>] (SEQ ID NO: 1035) 


Xhol 


ORF 109 


Forward 


GCGGATCCCATATG - GAAGATTTATATATAATACTCG [ < SEQ 
ID 1036>] (SEQ ID NO: 1036) 


BamHI-Ndel 




Reverse 


CCCGCTCGAG -ATCAGCTTCGAACCGAAG [<SEQ ID 
1037>] (SEQ ID NO: 1037) 


Xhol 


ORF110


Forward
Reverse 


AAAGAATTC - ATGAGTAAATCCCGTAGATCTCCC [ < SEQ ID 
1038>] (SEQ ID NO: 1038) 

AAACTGCAG - GGAAAACCACATCCGCACTCTGCC [<SEQ ID 
1039>] (SEQ ID NO: 1039) 


PstI 


ORF111


Forward
Reverse 


AAAGAATTC -GCACCGCAAAAGGCAAAAACCGCA [<SEQ ID 
1040>] (SEQ ID NO: 1040) 

AAACTGCAG - TCTGCGCGTTTTCGGGCAGGGTGG [<SEQ ID
1041>] (SEQ ID NO: 1041) 


EcoRI

PstI 


ORF113 


Forward 
Reverse 


AAAGAATTC - ATGAACAAAACCCTCTATCGTGTGATTTTCAAC 
[<SEQ ID 1042>] (SEQ ID NO: 1042)
AAACTGCAG - TTACGAATGCCTGCTTGCTCGACCGTACTG 
[<SEQ ID 1043>] (SEQ ID NO: 1043) 


EcoRI 
PstI 


ORF115 


Forward 
Reverse 


AAAGAATTC- TTGCTTGTGCAAACAGAAAAAGACGG [ <SEQ 
ID 1044>] (SEQ ID NO: 1044)
AAAAAAGTCGAC - 

CTATTTTTTAGGGGC 7TTTGC 7TGTTTGAAAAGCCTGCC [<SEQ ID 
1045>] (SEQ ID NO: 1045) 


EcoRI 
SalI


ORF119


Forward 
Reverse 


AAAGAATTC- TACAACATGTATCAGGAAAACCAATACCG
[<SEQ ID 1046>] (SEQ ID NO: 1046) 

AAACTGCAG - TTATGAAAACAGGCGCAGGGCGGTTTTGCC 
[<SEQ ID 1047>] (SEQ ID NO: 1047) 


PstI 


ORF120


Forward 
Reverse 


AAAnAATTr-fiPAAGnCTACCCCAATCCGCCGTG T<SEO ID 
1048>] (SEQ ID NO: 1048) 

AAACTGCAG- CGGTTTGGCTGCCTGGCCGTTGAT [<SEQ ID 
1049>] (SEQ ID NO: 1049) 


PstI 


ORF121


Forward 
Reverse 


AAAfiAATTr-nfCTTGGTCTGGCTGGTTTTCGC F<SEO ID 
1050>] (SEQ ID NO: 1050) 

AAACTGCAG -TCATCCGCCACCCCACCTCGGCCATCCATC 
[<SEQ ID 1051>] (SEQ ID NO: 1051) 


PstI 


ORF122


Forward 
Reverse 


AAAAAAGTCGAC - ATGTC 7TACCG CGCAAGCAGTTC !TCC 
[<SEQ ID 1052>] (SEQ ID NO: 1052) 

AAACTGCAG - TCAGGAACACAAACGATGACGAATATCCGTATC 
[<SEQ ID 1053>] (SEQ ID NO: 1053) 


SalI

PstI 




Forward

Reverse 


AAAGAATTC - GCGCTGTTTTTTGCGGCGGCGTAT [ < SEQ ID 
1054>] (SEQ ID NO: 1054) 

AAACTGCAG - CGCCGTTTCAAGACGAAAAAGTCG [ < S EQ ID 


EcoRI 
PstI 


ORF126 


Forward 


AAAGAATTC - GCGGAAACGGTCGAAG [<SEQ ID 1056 >] 
(SEQ ID NO: 1056) 


EcoRI 




Reverse 


AAACTGCAG- TTAATCTTGTCTTCCGATATAC [<SEQ ID 
1057>] (SEQ ID NO: 1057) 


PstI 


ORF127 


Forward 


AAAGAATTC - ATGACTGATAATCGGGGGTTTACG [<SEQ ID 
1058>] (SEQ ID NO: 1058) 


EcoRI 


Reverse


AAAAAAGTCGAC - CTTAAGTAACTTGCAGTCCTTATC [ < SEQ 
ID 1059>] (SEQ ID NO: 1059) 


SalI


ORF128


Forward

Reverse 


AAAGAATTC - ATGCAAGCTGTCCG CTACAGGCC [<SEQ ID 
1060>] (SEQ ID NO: 1060) 

AAACTGCAG-CTA7TGCAATGCGCCGCCGCGGGAATGTTTGAGCAGGC 
G [<SEQ ID 1061>] (SEQ ID NO: 1061) 


EcoRI 
PstI 


ORF129 


Forward 
Reverse 


AAAGAATTC - ATGGATTTTCGTTTTGACATTATTTACGAATAC 
CG [<SEQ ID 1062>] (SEQ ID NO: 1062)
AAACTGCAG - TTATTTTTTGATGAAATTTTGGGGCGG [ < SEQ 
ID 1063>] (SEQ ID NO: 1063) 


EcoRI 
PstI 


ORF130 


Forward 
Reverse 


AAAGAATTC -GCAGTACTTGCCATTCTCGGTGCG [<SEQ ID 
1064>] (SEQ ID NO: 1064) 

AAACTGCAG - CTCCGGATCGTCTGTAAACGCATT [<SEQ ID 
1065>] (SEQ ID NO: 1065) 


EcoRI 
PstI 


ORF 131 


Forward 
Reverse 


GCGGATCCCATATG - GAAATTCGGGCAATAAAAT [<SEQ ID 
1066>] (SEQ ID NO: 1066) 

CCCGCTCGAG- CCAGCGGACGCGTTC [<SEQ ID 1067>] 
(SEQ ID NO: 1067) 


BamHI-Ndel 
Xhol 


ORF 132 


Forward 
Reverse 


GCGGATCCCATATG -AAAGAAGCGGGGTTTG [<SEQ ID 
1068>] (SEQ ID NO: 1068) 

CCCGCTCGAG- CCAATCTGCCAGCCGT [<SEQ ID 1069>] 
(SEQ ID NO: 1069)


BamHI-Ndel 
Xhol 


ORF 133 


Forward 
Reverse 


CGCGGATCCCATATG - GAAGATGCAGGGCGCG [<SEQ ID 
1070>] (SEQ ID NO: 1070) 
CCCGCTCGAG- AAACTTGTAGCTCATCGT [<SEQ ID 
1071>] (SEQ ID NO: 1071) 


BamHI-Ndel 
Xhol 


ORF 134 


Forward 
Reverse 


GCGGATCCCATATG -TCTGTGCAAGCAGTATTG [<SEQ ID 
1072>] (SEQ ID NO: 1072) 

CCCGCTCGAG -ATCCTGTGCCAATGCG [<SEQ ID 1073 >] 
(SEQ ID NO: 1073) 


BamHI-Ndel 
Xhol 


ORF 135 


Forward 
Reverse 


GCGGATCCCATATG - CCGTCTGAAAAAGCTTT [<SEQ ID 
1074>] (SEQ ID NO: 1074) 
CCCGCTCGAG -AAATACCGCTGAGGATG [<SEQ ID 
1075>] (SEQ ID NO: 1075) 


BamHI-Ndel 
Xhol 


ORF 136 


Forward 
Reverse 


CGCGGATCCGCTAGC- ATGAAGCGGCGIAIAGCC [<SEQ ID
1076>] (SEQ ID NO: 1076) 

CCCGCTCGAG - TTCCGAATATTTGGAACTTTT [<SEQ ID 
1077>] (SEQ ID NO: 1077) 


BamHI-NheI
Xhol 


ORF 137 


Forward 
Reverse 


CGCGGATCCCATATG - GGCACGGCGGGAAATA [<SEQ ID 
1078>] (SEQ ID NO: 1078) 

CCCGCTCGAG -ATAACGGTATGCCGCC [<SEQ ID 1079>] 
(SEQ ID NO: 1079) 


BamHI-Ndel 
Xhol 


ORF 138


Forward 
Reverse 


GCGGATCCCATATG - TTTCGTTTACAATTCAGGC [ < SEQ ID 
1080>] (SEQ ID NO: 1080) 

CCCGCTCGAG - CGGCGTTTTATAGCGG [<SEQ ID 1081>] 
(SEQ ID NO: 1081) 


BamHI-Ndel 
Xhol 


ORF 139 


Forward 
Reverse 


GCGGATCCCATATG - GCTTTTTTGGCGGTAATG [<SEQ ID 
1082>] (SEQ ID NO: 1082) 
CCCGCTCGAG -TAACGTTTCCGTGCGTTT [<SEQ ID 
1083>] (SEQ ID NO: 1083) 


BamHI-Ndel 
Xhol 


ORF 140 


Forward 
Reverse 


GCGGATCCCATATG- TTGCCCACAGGCAGC [<SEQ ID 
1084>] (SEQ ID NO: 1084) 

CCCGCTCGAG - GACGATGGCAAACAGC [<SEQ ID 1085 >] 
(SEQ ID NO: 1085) 


BamHI-Ndel 
Xhol 


ORF 141 


Forward 
Reverse 


GCGGATCCCATATG - CCGTCTGAAGCAGTCT [ < SEQ ID 
1086>] (SEQ ID NO: 1086) 

CCCGCTCGAG -ATCTGTTGTTTTTAAAATATT [<SEQ ID 
1087>] (SEQ ID NO: 1087) 


BamHI-Ndel 
Xhol 


ORF 142 


Forward 
Reverse 


GCGGATCCCATATG - GATAATTCTGGTAGTGAAG [<SEQ ID 
1088>] (SEQ ID NO: 1088) 
CCCGCTCGAG -AAACGTATAGCCTACCT [<SEQ ID 
1089>] (SEQ ID NO: 1089) 


BamHI-Ndel 
Xhol 


ORF 143 


Forward 
Reverse 


GCGGATCCCATATG -GATACCGCTTTGAACCT [<SEQ ID 
1090>] (SEQ ID NO: 1090) 
CCCGCTCGAG -AATGGCTTCCGCAATATG [<SEQ ID 
1091>] (SEQ ID NO: 1091) 


BamHI-Ndel 
Xhol 


ORF 144 


Forward 
Reverse 


GCGGATCCCATATG -ACCTTTTTACAACGTTTGC [<SEQ ID 
1092>] (SEQ ID NO: 1092)

CCCGCTCGAG -AGATTGTTGTTGTTTTTTCG [<SEQ ID 
1093>] (SEQ ID NO: 1093) 


BamHI-Ndel 
Xhol 


ORF 147 


Forward 
Reverse 


GCGGATCCCATATG -TCTGTCTTTCAAACGGC [<SEQ ID 
1094>] (SEQ ID NO: 1094) 
CCCGCTCGAG -TTTGTTTTTGCAAGACAG [<SEQ ID 
1095>] (SEQ ID NO: 1095) 


BamHI-Ndel 
Xhol 



NB:

- restriction sites are underlined

- for ORFs 110-130, where the ORF itself carries an EcoRI site (e.g. ORF122), a SalI site
was used in the forward primer instead. Similarly, where the ORF carries a PstI site (e.g.
ORFs 115 and 127), a SalI site was used in the reverse primer.
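The site-selection rule in the note above can be illustrated with a short sketch. The function name and the example sequences are hypothetical; the recognition sequences used (GAATTC for EcoRI, CTGCAG for PstI) are palindromic, so searching one strand is sufficient. The sketch does not check for an internal SalI site, which the text does not discuss.

```python
ECORI_SITE = "GAATTC"  # EcoRI recognition sequence
PSTI_SITE = "CTGCAG"   # PstI recognition sequence


def choose_cloning_sites(orf_seq: str) -> tuple:
    """Return (forward_site, reverse_site) for an ORF in the 110-130 series.

    EcoRI/PstI are the defaults; SalI replaces whichever enzyme
    would also cut inside the ORF itself.
    """
    s = orf_seq.upper()
    forward = "SalI" if ECORI_SITE in s else "EcoRI"
    reverse = "SalI" if PSTI_SITE in s else "PstI"
    return forward, reverse


# Hypothetical toy sequences, not taken from the sequence listing.
print(choose_cloning_sites("ATGAAACCCGGG"))     # no internal sites
print(choose_cloning_sites("ATGGAATTCAAATAA"))  # internal EcoRI site
print(choose_cloning_sites("ATGCTGCAGAAATAA"))  # internal PstI site
```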




Oligos were synthesized by a Perkin Elmer 394 DNA/RNA Synthesizer, eluted from the columns
in 2ml NH4OH, and deprotected by 5 hours incubation at 56°C. The oligos were precipitated by
addition of 0.3M Na-Acetate and 2 volumes ethanol. The samples were then centrifuged and the
pellets resuspended in either 100µl or 1ml of water. OD260 was determined using a Perkin Elmer
Lambda Bio spectrophotometer and the concentration was determined and adjusted to 2-10pmol/µl.

C) Amplification 

The standard PCR protocol was as follows: 50-200ng of genomic DNA were used as a template in
the presence of 20-40µM of each oligo, 400-800µM dNTPs solution, 1x PCR buffer (including
1.5mM MgCl2), 2.5 units Taq DNA polymerase (using Perkin-Elmer AmpliTaq, GIBCO
Platinum, Pwo DNA polymerase, or Takara Shuzo Taq polymerase).

In some cases, PCR was optimised by the addition of 10µl DMSO or 50µl 2M betaine.

After a hot start (adding the polymerase during a preliminary 3 minute incubation of the whole mix
at 95°C), each sample underwent a double-step amplification: the first 5 cycles were performed
at the hybridization temperature of the oligo excluding the restriction enzyme tail,
followed by 30 cycles performed at the hybridization temperature of the whole-length
oligo. The cycles were followed by a final 10 minute extension step at 72°C.

The standard cycles were as follows: 





                  Denaturation       Hybridisation         Elongation

First 5 cycles    30 seconds, 95°C   30 seconds, 50-55°C   30-60 seconds, 72°C

Last 30 cycles    30 seconds, 95°C   30 seconds, 65-70°C   30-60 seconds, 72°C



The elongation time varied according to the length of the ORF to be amplified. 
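The double-step cycling scheme described above can be written down as a small data structure. The following is a minimal sketch with hypothetical names; it encodes only the temperatures and hold times stated in the text and ignores instrument ramp times, so the computed duration is a lower bound rather than a real run time.

```python
def build_program(tm_hybridising_c: float, tm_whole_c: float, elongation_s: int = 60):
    """Return stages as (label, cycles, [(temp_C, seconds), ...])."""
    if not 30 <= elongation_s <= 60:
        raise ValueError("the text gives 30-60 seconds for elongation")
    return [
        ("hot start", 1, [(95, 180)]),
        # First 5 cycles anneal at the Tm of the hybridising region alone.
        ("first 5 cycles", 5, [(95, 30), (tm_hybridising_c, 30), (72, elongation_s)]),
        # Last 30 cycles anneal at the Tm of the whole oligo (tail included).
        ("last 30 cycles", 30, [(95, 30), (tm_whole_c, 30), (72, elongation_s)]),
        ("final extension", 1, [(72, 600)]),
    ]


def total_hold_seconds(program) -> int:
    """Sum of programmed hold times, excluding ramping between temperatures."""
    return sum(cycles * sum(sec for _, sec in steps) for _, cycles, steps in program)


program = build_program(tm_hybridising_c=52, tm_whole_c=68, elongation_s=60)
print(total_hold_seconds(program) / 60, "minutes of programmed hold time")
```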

The amplifications were performed using either a 9600 or a 2400 Perkin Elmer GeneAmp PCR
System. To check the results, 1/10 of the amplification volume was loaded onto a 1-1.5% agarose
gel and the size of each amplified fragment compared with a DNA molecular weight marker.

The amplified DNA was either loaded directly on a 1% agarose gel or first precipitated with
ethanol and resuspended in a suitable volume to be loaded on a 1% agarose gel. The DNA
fragment corresponding to the band of the correct size was then eluted and purified from the gel
using the Qiagen Gel Extraction Kit, following the manufacturer's instructions. The final volume
of the DNA fragment was 30µl or 50µl of either water or 10mM Tris, pH 8.5.

D) Digestion of PCR fragments 

The purified DNA corresponding to the amplified fragment was split into 2 aliquots and
double-digested with:

- NdeI/XhoI or NheI/XhoI for cloning into pET-21b+ and further expression of the protein
as a C-terminus His-tag fusion

- BamHI/XhoI or EcoRI/XhoI for cloning into pGEX-KG and further expression of the
protein as N-terminus GST fusion.

- For ORF 76, NheI/BamHI for cloning into pTRC-HisA vector and further expression of
the protein as N-terminus His-tag fusion.

- EcoRI/PstI, EcoRI/SalI, or SalI/PstI for cloning into pGEX-His and further expression of
the protein as N-terminus His-tag fusion

Each purified DNA fragment was incubated (37°C for 3 hours to overnight) with 20 units of each
restriction enzyme (New England Biolabs) in a final volume of either 30 or 40µl in the presence of
the appropriate buffer. The digestion product was then purified using the QIAquick PCR
purification kit, following the manufacturer's instructions, and eluted in a final volume of 30 or
50µl of either water or 10mM Tris-HCl, pH 8.5. The final DNA concentration was determined by
1% agarose gel electrophoresis in the presence of titrated molecular weight marker.

E) Digestion of the cloning vectors (pET22B, pGEX-KG, pTRC-HisA, and pGEX-His)

10µg plasmid was double-digested with 50 units of each restriction enzyme in 200µl reaction
volume in the presence of appropriate buffer by overnight incubation at 37°C. After loading the
whole digestion on a 1% agarose gel, the band corresponding to the digested vector was purified
from the gel using the Qiagen QIAquick Gel Extraction Kit and the DNA was eluted in 50µl of
10mM Tris-HCl, pH 8.5. The DNA concentration was evaluated by measuring OD260 of the
sample, and adjusted to 50µg/µl. 1µl of plasmid was used for each cloning procedure.

The vector pGEX-His is a modified pGEX-2T vector carrying a region encoding six histidine
residues upstream of the thrombin cleavage site and containing the multiple cloning site of the
vector pTRC99 (Pharmacia).

F) Cloning 

The fragments corresponding to each ORF, previously digested and purified, were ligated in both
pET22b and pGEX-KG. In a final volume of 20µl, a molar ratio of 3:1 fragment/vector was ligated
using 0.5µl of NEB T4 DNA ligase (400 units/µl), in the presence of the buffer supplied by the
manufacturer. The reaction was incubated at room temperature for 3 hours. In some experiments,
ligation was performed using the Boehringer "Rapid Ligation Kit", following the manufacturer's
instructions.
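A 3:1 fragment/vector molar ratio translates into masses once the lengths of the two molecules are known. The sketch below shows the usual arithmetic for double-stranded DNA, where mass scales with length; the function name, the vector amount and both sizes are hypothetical examples, not values from the text.

```python
def insert_mass_ng(vector_ng: float, vector_bp: int, insert_bp: int,
                   molar_ratio: float = 3.0) -> float:
    """Mass of insert (ng) giving `molar_ratio` insert:vector molecules.

    For double-stranded DNA, moles are proportional to mass / length, so
    insert_ng = molar_ratio * vector_ng * insert_bp / vector_bp.
    """
    if vector_bp <= 0 or insert_bp <= 0:
        raise ValueError("lengths must be positive")
    return molar_ratio * vector_ng * insert_bp / vector_bp


# Hypothetical example: 50 ng of a 5000 bp vector and a 1000 bp fragment.
print(insert_mass_ng(vector_ng=50, vector_bp=5000, insert_bp=1000))
```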

In order to introduce the recombinant plasmid into a suitable strain, 100µl E. coli DH5 competent
cells were incubated with the ligase reaction solution for 40 minutes on ice, then at 37°C for 3
minutes, then, after adding 800µl LB broth, again at 37°C for 20 minutes. The cells were then
centrifuged at maximum speed in an Eppendorf microfuge and resuspended in approximately
200µl of the supernatant. The suspension was then plated on LB ampicillin (100mg/ml).

The screening of the recombinant clones was performed by growing 5 randomly-chosen colonies
overnight at 37°C in either 2ml (pGEX or pTC clones) or 5ml (pET clones) LB broth + 100µg/ml
ampicillin. The cells were then pelleted and the DNA extracted using the Qiagen QIAprep Spin
Miniprep Kit, following the manufacturer's instructions, to a final volume of 30µl. 5µl of each
individual miniprep (approximately 1µg) were digested with either NdeI/XhoI or BamHI/XhoI and
the whole digestion loaded onto a 1-1.5% agarose gel (depending on the expected insert size), in
parallel with the molecular weight marker (1Kb DNA Ladder, GIBCO). The screening of the
positive clones was made on the basis of the correct insert size.

For the cloning of ORFs 110, 111, 113, 115, 119, 122, 125 & 130, the double-digested PCR
product was ligated into double-digested vector using EcoRI-PstI cloning sites or, for ORFs 115 &
127, EcoRI-SalI or, for ORF 122, SalI-PstI. After cloning, the recombinant plasmids were
introduced into the E.coli host W3110. Individual clones were grown overnight at 37°C in L-broth
with 50µl/ml ampicillin.

G) Expression 

Each ORF cloned into the expression vector was transformed into the strain suitable for expression
of the recombinant protein product. 1µl of each construct was used to transform 30µl of E.coli
BL21 (pGEX vector), E.coli TOP10 (pTRC vector) or E.coli BL21-DE3 (pET vector), as
described above. In the case of the pGEX-His vector, the same E.coli strain (W3110) was used for
initial cloning and expression. Single recombinant colonies were inoculated into 2ml LB+Amp
(100µg/ml), incubated at 37°C overnight, then diluted 1:30 in 20ml of LB+Amp (100µg/ml) in
100ml flasks, making sure that the OD600 ranged between 0.1 and 0.15. The flasks were incubated
at 30°C in gyratory water bath shakers until OD indicated exponential growth suitable for
induction of expression (0.4-0.8 OD for pET and pTRC vectors; 0.8-1 OD for pGEX and pGEX-His
vectors). For the pET, pTRC and pGEX-His vectors, the protein expression was induced by
addition of 1mM IPTG, whereas in the case of pGEX system the final concentration of IPTG was
0.2mM. After 3 hours incubation at 30°C, the final concentration of the sample was checked by
OD. In order to check expression, 1ml of each sample was removed, centrifuged in a microfuge,
the pellet resuspended in PBS, and analysed by 12% SDS-PAGE with Coomassie Blue staining.
The whole sample was centrifuged at 6000g and the pellet resuspended in PBS for further use.
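The vector-specific induction conditions given above can be collected in a lookup table. This is only an illustrative sketch: the dictionary layout and function name are hypothetical, while the OD windows and IPTG concentrations are those stated in the text.

```python
# Induction OD window and final IPTG concentration per vector, as stated in the text.
INDUCTION = {
    "pET":      {"od_window": (0.4, 0.8), "iptg_mM": 1.0},
    "pTRC":     {"od_window": (0.4, 0.8), "iptg_mM": 1.0},
    "pGEX":     {"od_window": (0.8, 1.0), "iptg_mM": 0.2},
    "pGEX-His": {"od_window": (0.8, 1.0), "iptg_mM": 1.0},
}


def ready_to_induce(vector: str, od: float) -> bool:
    """True when the culture OD lies inside the induction window for the vector."""
    low, high = INDUCTION[vector]["od_window"]
    return low <= od <= high


print(ready_to_induce("pET", 0.5))   # inside the 0.4-0.8 window
print(ready_to_induce("pGEX", 0.5))  # below the 0.8-1 window
```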

H) GST-fusion proteins large-scale purification.

A single colony was grown overnight at 37°C on an LB+Amp agar plate. The bacteria were inoculated
into 20ml of LB+Amp liquid culture in a water bath shaker and grown overnight. Bacteria were
diluted 1:30 into 600ml of fresh medium and allowed to grow at the optimal temperature (20-37°C)
to OD550 0.8-1. Protein expression was induced with 0.2mM IPTG followed by three hours
incubation. The culture was centrifuged at 8000rpm at 4°C. The supernatant was discarded and the
bacterial pellet was resuspended in 7.5ml cold PBS. The cells were disrupted by sonication on ice
for 30 sec at 40W using a Branson sonifier B-15, frozen and thawed twice and centrifuged again.
The supernatant was collected and mixed with 150µl Glutathione-Sepharose 4B resin (Pharmacia)
(previously washed with PBS) and incubated at room temperature for 30 minutes. The sample was
centrifuged at 700g for 5 minutes at 4°C. The resin was washed twice with 10ml cold PBS for 10
minutes, resuspended in 1ml cold PBS, and loaded on a disposable column. The resin was washed
twice with 2ml cold PBS until the flow-through reached OD280 of 0.02-0.06. The GST-fusion
protein was eluted by addition of 700µl cold Glutathione elution buffer (10mM reduced
glutathione, 50mM Tris-HCl) and fractions collected until the OD280 was 0.1. 21µl of each fraction
were loaded on a 12% SDS gel using either Biorad SDS-PAGE Molecular weight standard broad
range (M1) (200, 116.25, 97.4, 66.2, 45, 31, 21.5, 14.4, 6.5 kDa) or Amersham Rainbow Marker
(M2) (220, 66, 46, 30, 21.5, 14.3 kDa) as standards. As the MW of GST is 26kDa, this value must
be added to the MW of each GST-fusion protein.

I) His-fusion solubility analysis (ORFs 111-129) 

To analyse the solubility of the His-fusion expression products, pellets of 3ml cultures were
resuspended in buffer M1 [500µl PBS pH 7.2]. 25µl lysozyme (10mg/ml) was added and the
bacteria were incubated for 15 min at 4°C. The pellets were sonicated for 30 sec at 40W using a
Branson sonifier B-15, frozen and thawed twice and then separated again into pellet and
supernatant by a centrifugation step. The supernatant was collected and the pellet was resuspended
in buffer M2 [8M urea, 0.5M NaCl, 20mM imidazole and 0.1M NaH2PO4] and incubated for 3 to
4 hours at 4°C. After centrifugation, the supernatant was collected and the pellet was resuspended
in buffer M3 [6M guanidinium-HCl, 0.5M NaCl, 20mM imidazole and 0.1M NaH2PO4] overnight
at 4°C. The supernatants from all steps were analysed by SDS-PAGE.

The proteins expressed from ORFs 113, 119 and 120 were found to be soluble in PBS, whereas
ORFs 111, 122, 126 and 129 need urea and ORFs 125 and 127 need guanidinium-HCl for their
solubilization.

J) His-fusion large-scale purification.

A single colony was grown overnight at 37°C on an LB + Amp agar plate. The bacteria were
inoculated into 20ml of LB+Amp liquid culture and incubated overnight in a water bath shaker.
Bacteria were diluted 1:30 into 600ml fresh medium and allowed to grow at the optimal
temperature (20-37°C) to OD550 0.6-0.8. Protein expression was induced by addition of 1mM IPTG
and the culture further incubated for three hours. The culture was centrifuged at 8000rpm at 4°C,
the supernatant was discarded and the bacterial pellet was resuspended in 7.5ml of either (i) cold
buffer A (300mM NaCl, 50mM phosphate buffer, 10mM imidazole, pH 8) for soluble proteins or
(ii) buffer B (urea 8M, 10mM Tris-HCl, 100mM phosphate buffer, pH 8.8) for insoluble proteins.

The cells were disrupted by sonication on ice for 30 sec at 40W using a Branson sonifier B-15,
frozen and thawed two times and centrifuged again.

For insoluble proteins, the supernatant was stored at -20°C, while the pellets were resuspended in
2ml buffer C (6M guanidine hydrochloride, 100mM phosphate buffer, 10mM Tris-HCl, pH 7.5)
and treated in a homogenizer for 10 cycles. The product was centrifuged at 13000rpm for 40
minutes.

Supernatants were collected and mixed with 150µl Ni2+-resin (Pharmacia) (previously washed with
either buffer A or buffer B, as appropriate) and incubated at room temperature with gentle agitation
for 30 minutes. The sample was centrifuged at 700g for 5 minutes at 4°C. The resin was washed
twice with 10ml buffer A or B for 10 minutes, resuspended in 1ml buffer A or B and loaded on a
disposable column. The resin was washed at either (i) 4°C with 2ml cold buffer A or (ii) room
temperature with 2ml buffer B, until the flow-through reached OD280 of 0.02-0.06.

The resin was washed with either (i) 2ml cold 20mM imidazole buffer (300mM NaCl, 50mM
phosphate buffer, 20mM imidazole, pH 8) or (ii) buffer D (urea 8M, 10mM Tris-HCl, 100mM
phosphate buffer, pH 6.3) until the flow-through reached an OD280 of 0.02-0.06. The His-fusion
protein was eluted by addition of 700µl of either (i) cold elution buffer A (300mM NaCl, 50mM
phosphate buffer, 250mM imidazole, pH 8) or (ii) elution buffer B (urea 8M, 10mM Tris-HCl,
100mM phosphate buffer, pH 4.5) and fractions collected until the OD280 was 0.1. 21µl of each
fraction were loaded on a 12% SDS gel.

K) His-fusion proteins renaturation 

10% glycerol was added to the denatured proteins. The proteins were then diluted to 20µg/ml using
dialysis buffer I (10% glycerol, 0.5M arginine, 50mM phosphate buffer, 5mM reduced glutathione,
0.5mM oxidised glutathione, 2M urea, pH 8.8) and dialysed against the same buffer at 4°C for
12-14 hours. The protein was further dialysed against dialysis buffer II (10% glycerol, 0.5M arginine,
50mM phosphate buffer, 5mM reduced glutathione, 0.5mM oxidised glutathione, pH 8.8) for 12-14
hours at 4°C. Protein concentration was evaluated using the formula:

Protein (mg/ml) = (1.55 x OD280) - (0.76 x OD260)
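The concentration formula above is simple to evaluate in code. The following sketch uses a hypothetical function name and hypothetical absorbance readings purely to show the arithmetic.

```python
def protein_mg_per_ml(od280: float, od260: float) -> float:
    """Protein (mg/ml) = (1.55 x OD280) - (0.76 x OD260)."""
    return 1.55 * od280 - 0.76 * od260


# Hypothetical readings, chosen only to illustrate the calculation.
print(round(protein_mg_per_ml(od280=0.5, od260=0.3), 3))
```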

L) His-fusion large-scale purification (ORFs 111-129) 

500ml of bacterial cultures were induced and the fusion proteins were obtained soluble in buffer
M1, M2 or M3 using the procedure described above. The crude extract of the bacteria was loaded
onto a Ni-NTA superflow column (Qiagen) equilibrated with buffer M1, M2 or M3 depending on
the solubilization buffer of the fusion proteins. Unbound material was eluted by washing the
column with the same buffer. The specific protein was eluted with the corresponding buffer
containing 500mM imidazole and dialysed against the corresponding buffer without imidazole.
After each run the columns were sanitized by washing with at least two column volumes of 0.5 M
sodium hydroxide and reequilibrated before the next use.

M) Mice immunisations

20ng of each purified protein were used to immunise mice intraperitoneally. In the case of ORFs 2,
4, 15, 22, 27, 28, 37, 76, 89 and 97, Balb-C mice were immunised with Al(OH)3 as adjuvant on
days 1, 21 and 42, and immune response was monitored in samples taken on day 56. For ORFs 44,
106 and 132, CD1 mice were immunised using the same protocol. For ORFs 25 and 40, CD1 mice
were immunised using Freund's adjuvant, rather than Al(OH)3, and the same immunisation
protocol was used, except that the immune response was measured on day 42, rather than 56.
Similarly, for ORFs 23, 32, 38 and 79, CD1 mice were immunised with Freund's adjuvant, but the
immune response was measured on day 49.

N) ELISA assay (sera analysis) 

The acapsulated MenB M7 strain was plated on chocolate agar plates and incubated overnight at
37°C. Bacterial colonies were collected from the agar plates using a sterile dacron swab and
inoculated into 7ml of Mueller-Hinton Broth (Difco) containing 0.25% glucose. Bacterial growth
was monitored every 30 minutes by following OD620. The bacteria were left to grow until the OD
reached the value of 0.3-0.4. The culture was centrifuged for 10 minutes at 10000rpm. The
supernatant was discarded and bacteria were washed once with PBS, resuspended in PBS
containing 0.025% formaldehyde, and incubated for 2 hours at room temperature and then
overnight at 4°C with stirring. 100µl bacterial cells were added to each well of a 96 well Greiner
plate and incubated overnight at 4°C. The wells were then washed three times with PBT washing
buffer (0.1% Tween-20 in PBS). 200µl of saturation buffer (2.7% Polyvinylpyrrolidone 10 in
water) was added to each well and the plates incubated for 2 hours at 37°C. Wells were washed
three times with PBT. 200µl of diluted sera (Dilution buffer: 1% BSA, 0.1% Tween-20, 0.1%
NaN3 in PBS) were added to each well and the plates incubated for 90 minutes at 37°C. Wells were
washed three times with PBT. 100µl of HRP-conjugated rabbit anti-mouse (Dako) serum diluted
1:2000 in dilution buffer were added to each well and the plates were incubated for 90 minutes at
37°C. Wells were washed three times with PBT buffer. 100µl of substrate buffer for HRP (25ml of
citrate buffer pH5, 10mg of O-phenildiamine and 10µl of H2O) were added to each well and the
plates were left at room temperature for 20 minutes. 100µl H2SO4 was added to each well and
OD490 was followed. The ELISA was considered positive when OD490 was 2.5 times that of the
respective pre-immune sera.
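
The positivity criterion stated above can be sketched in Python as follows. This is only an illustration: the function and parameter names are hypothetical, and reading "2.5 times" as "at least 2.5 times" is an assumption, since the text does not say how readings above that ratio were treated.

```python
def elisa_positive(od490_immune: float, od490_preimmune: float,
                   factor: float = 2.5) -> bool:
    """Return True when the immune-serum OD490 reaches `factor` times
    the OD490 of the matching pre-immune serum.

    Assumption: the threshold is treated as a minimum (>=).
    """
    return od490_immune >= factor * od490_preimmune


if __name__ == "__main__":
    # Hypothetical plate readings, for illustration only.
    print(elisa_positive(od490_immune=1.2, od490_preimmune=0.4))
    print(elisa_positive(od490_immune=0.9, od490_preimmune=0.4))
```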

O) FACScan bacteria Binding Assay procedure. 

The acapsulated MenB M7 strain was plated on chocolate agar plates and incubated overnight at
37°C. Bacterial colonies were collected from the agar plates using a sterile dacron swab and
inoculated into 4 tubes containing 8ml each Mueller-Hinton Broth (Difco) containing 0.25%
glucose. Bacterial growth was monitored every 30 minutes by following OD620. The bacteria were
left to grow until the OD reached the value of 0.35-0.5. The culture was centrifuged for 10 minutes
at 4000rpm. The supernatant was discarded and the pellet was resuspended in blocking buffer (1%
BSA, 0.4% NaN3) and centrifuged for 5 minutes at 4000rpm. Cells were resuspended in blocking
buffer to reach OD620 of 0.07. 100µl bacterial cells were added to each well of a Costar 96 well
plate. 100µl of diluted (1:200) sera (in blocking buffer) were added to each well and plates
incubated for 2 hours at 4°C. Cells were centrifuged for 5 minutes at 4000rpm, the supernatant
aspirated and cells washed by addition of 200µl/well of blocking buffer in each well. 100µl of
R-Phycoerythrin conjugated F(ab)2 goat anti-mouse, diluted 1:100, was added to each well and plates
incubated for 1 hour at 4°C. Cells were spun down by centrifugation at 4000rpm for 5 minutes and
washed by addition of 200µl/well of blocking buffer. The supernatant was aspirated and cells
resuspended in 200µl/well of PBS, 0.25% formaldehyde. Samples were transferred to FACScan
tubes and read. The FACScan settings were: FL1 on, FL2 and FL3 off; FSC-H
threshold: 92; FSC PMT Voltage: E 02; SSC PMT: 474; Amp. Gains 7.1; FL-2 PMT: 539;
compensation values: 0.

P) OMV preparations

Bacteria were grown overnight on 5 GC plates, harvested with a loop and resuspended in 10 ml
20mM Tris-HCl. Heat inactivation was performed at 56°C for 30 minutes and the bacteria
disrupted by sonication for 10 minutes on ice (50% duty cycle, 50% output). Unbroken cells were
removed by centrifugation at 5000g for 10 minutes and the total cell envelope fraction recovered
by centrifugation at 50000g at 4°C for 75 minutes. To extract cytoplasmic membrane proteins from
the crude outer membranes, the whole fraction was resuspended in 2% sarkosyl (Sigma) and
incubated at room temperature for 20 minutes. The suspension was centrifuged at 10000g for 10
minutes to remove aggregates, and the supernatant further ultracentrifuged at 50000g for 75
minutes to pellet the outer membranes. The outer membranes were resuspended in 10mM
Tris-HCl, pH8 and the protein concentration measured by the Bio-Rad Protein assay, using BSA as a
standard.

Q) Whole Extracts preparation

Bacteria were grown overnight on a GC plate, harvested with a loop and resuspended in 1ml of 
20mM Tris-HCl. Heat inactivation was performed at 56°C for 30 minutes. 

R) Western blotting 

Purified proteins (500ng/lane), outer membrane vesicles (5µg) and total cell extracts (25µg)
derived from MenB strain 2996 were loaded on 15% SDS-PAGE and transferred to a nitrocellulose
membrane. The transfer was performed for 2 hours at 150mA at 4°C, in transferring buffer (0.3%
Tris base, 1.44% glycine, 20% methanol). The membrane was saturated by overnight incubation at
4°C in saturation buffer (10% skimmed milk, 0.1% Triton X100 in PBS). The membrane was
washed twice with washing buffer (3% skimmed milk, 0.1% Triton X100 in PBS) and incubated
for 2 hours at 37°C with mice sera diluted 1:200 in washing buffer. The membrane was washed
twice and incubated for 90 minutes with a 1:2000 dilution of horseradish peroxidase labelled
anti-mouse Ig. The membrane was washed twice with 0.1% Triton X100 in PBS and developed with the
Opti-4CN Substrate Kit (Bio-Rad). The reaction was stopped by adding water.

S) Bactericidal assay

MC58 strain was grown overnight at 37°C on chocolate agar plates. 5-7 colonies were collected
and used to inoculate 7ml Mueller-Hinton broth. The suspension was incubated at 37°C on a
nutator and left to grow until OD620 was 0.5-0.8. The culture was aliquoted into sterile 1.5ml
Eppendorf tubes and centrifuged for 20 minutes at maximum speed in a microfuge. The pellet was
washed once in Gey's buffer (Gibco) and resuspended in the same buffer to an OD620 of 0.5,
diluted 1:20000 in Gey's buffer and stored at 25°C.

50µl of Gey's buffer/1% BSA was added to each well of a 96-well tissue culture plate. 25µl of
diluted mice sera (1:100 in Gey's buffer/0.2% BSA) were added to each well and the plate
incubated at 4°C. 25µl of the previously described bacterial suspension were added to each well.
25µl of either heat-inactivated (56°C waterbath for 30 minutes) or normal baby rabbit complement
were added to each well. Immediately after the addition of the baby rabbit complement, 22µl of
each sample/well were plated on Mueller-Hinton agar plates (time 0). The 96-well plate was
incubated for 1 hour at 37°C with rotation and then 22µl of each sample/well were plated on
Mueller-Hinton agar plates (time 1). After overnight incubation the colonies corresponding to time
0 and time 1 hour were counted.
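
The text does not state how the two colony counts were compared. One common way to summarise such data, shown here purely as an assumption and not as the procedure actually used, is percent survival: the time 1 count expressed as a percentage of the time 0 count. The function name and the example counts are hypothetical.

```python
def percent_survival(cfu_time0: int, cfu_time1: int) -> float:
    """Colonies at time 1 as a percentage of colonies at time 0.

    Assumption: survival is summarised as a simple ratio of counts;
    the source text only says that both time points were counted.
    """
    if cfu_time0 <= 0:
        raise ValueError("time 0 count must be positive")
    if cfu_time1 < 0:
        raise ValueError("time 1 count cannot be negative")
    return 100.0 * cfu_time1 / cfu_time0


if __name__ == "__main__":
    # Hypothetical counts, for illustration only.
    print(percent_survival(cfu_time0=200, cfu_time1=50))
```

Under this summary, a well with active complement and bactericidal serum would be expected to show a lower value than the matching well with heat-inactivated complement.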

Table II gives a summary of the cloning, expression and purification results.

TABLE II - Summary of cloning, expression and purification

ORF        PCR/cloning   His-fusion    GST-fusion    Purification
                         expression    expression

orf 1      +             +             +             His-fusion
orf 2      +             +             +             GST-fusion
orf 2.1    +             n.d.          +             GST-fusion
orf 4      +             +             +             His-fusion
orf 5      +             n.d.          +             GST-fusion
orf 6      +             +             +             GST-fusion
orf 7      +             +             +             GST-fusion
orf 8      +             n.d.          n.d.
orf 9      +             +             +             GST-fusion
orf 10     +             n.d.          n.d.
orf 11     +             n.d.          n.d.
orf 13     +             n.d.          +             GST-fusion
orf 15     +             +             +             GST-fusion
orf 17     +             n.d.          n.d.
orf 18     +             n.d.          n.d.
orf 19     +             n.d.          n.d.
orf 20     +             n.d.          n.d.
orf 22     +             +             +             GST-fusion
orf 23     +             +             +             His-fusion
orf 24                   n.d.          n.d.
orf 25     +             +             +             His-fusion
orf 26     +             n.d.          n.d.
orf 27     +             +             +             GST-fusion
orf 28     +             +             +             GST-fusion
orf 29     +             n.d.          n.d.
orf 32     +             +             +             His-fusion
orf 33     +             n.d.          n.d.
orf 35     +             n.d.          n.d.
orf 37     +             +             +             GST-fusion
orf 58     +             n.d.          n.d.
orf 65     +             n.d.          n.d.
orf 66     +             n.d.          n.d.
orf 72     +             +             n.d.          His-fusion
orf 73     +             n.d.          +             n.d.
orf 75     +             n.d.          n.d.
orf 76     +             +             n.d.          His-fusion
orf 79     +             +             n.d.          His-fusion
orf 83     +             n.d.          +             n.d.
orf 84     +             n.d.          n.d.
orf 85     +             n.d.          +             GST-fusion
orf 89     +             n.d.          +             GST-fusion
orf 97     +             +             +             GST-fusion
orf 98     +             n.d.          n.d.
orf 100    +             n.d.          n.d.
orf 101    +             n.d.          n.d.
orf 102    +             n.d.          n.d.
orf 103    +             n.d.          n.d.
orf 104    +             n.d.          n.d.
orf 105    +             n.d.          n.d.
orf 106    +             +             +             His-fusion
orf 109    +             n.d.          n.d.
orf 110    +             n.d.          n.d.
orf 111    +             +             n.d.          His-fusion
orf 113    +             +             n.d.          His-fusion
orf 115    n.d.          n.d.          n.d.
orf 119    +             +             n.d.          His-fusion
orf 120    +             +             n.d.          His-fusion
orf 121    +             n.d.          n.d.
orf 122    +             +             n.d.          His-fusion
orf 125    +             +             n.d.          His-fusion
orf 126    +             +             n.d.          His-fusion
orf 127    +             +             n.d.          His-fusion
orf 128    +             n.d.          n.d.
orf 129    +             +             n.d.          His-fusion
orf 130    +             n.d.          n.d.
orf 131    +             +             +             n.d.
orf 132    +             +             +             His-fusion
orf 133    +             n.d.          +             GST-fusion
orf 134    +             n.d.          n.d.
orf 135    +             n.d.          n.d.
orf 136    +             n.d.          n.d.
orf 137    +             n.d.          +             GST-fusion
orf 138    +             n.d.          +             GST-fusion
orf 139    +             n.d.          n.d.
orf 140    +             n.d.          n.d.
orf 141    +             n.d.          n.d.
orf 142    +             n.d.          n.d.
orf 143    +             n.d.          n.d.
orf 144    +             n.d.          +             n.d.
orf 147    +             n.d.          n.d.


Example 1 



The following partial DNA sequence was identified in N. meningitidis [<SEQ ID 1>] (SEQ ID NO:
1):

1 ATGAAACAGA CAGTCAA.AT GCTTGCCGCC GCCCTGATTG CCTTGGGCTT

51 GAACCGACCG GTGTGGNCGG ATGACGTATC GGATTTTCGG GAAAACTTGC

101 A.GCGGCAGC ACAGGGAAAT GCAGCAGCCC AATACAATTT GGGCGCAATG

151 TAT.TACAAA GGACGCGCGT GCGCCGGGAT GATGCTGAAG CGGTCAGATG

201 GTATCGGCAG CCGGCGGAAC AGGGGTTAGC CCAAGCCCAA TACAATTTGG

251 GCTGGATGTA TGCCAACGGG CGCGC.GTGC GCCAAGATGA TACCGAAGCG

301 GTCAGATGGT ATCGGCAGGC GGCAGCGCAG GGGGTTGTCC AAGCCCAATA

351 CAATTTGGGC GTGATATATG CCGAAGGACG TGGAGTGCGC CAAGACGATG

401 TCGAAGCGGT CAGATGGTTT CGGCAGGCGG CAGCGCAGGG GGTAGCCCAA

451 GCCCAAAACA ATTTGGGCGT GATGTATGCC GAAAGANCGC GCGTGCGCCA

501 AGACCG...

This corresponds to the amino acid sequence [<SEQ ID 2; ORF37>] (SEQ ID NO: 2; ORF37):

1 MKQTVXMLAA ALIALGLNRP VWXDDVSDFR ENLXAAAQGN AAAQYNLGAM

51 YXQRTRVRRD DAEAVRWYRQ PAEQGLAQAQ YNLGWMYANG RXVRQDDTEA

101 VRWYRQAAAQ GVVQAQYNLG VIYAEGRGVR QDDVEAVRWF RQAAAQGVAQ

151 AQNNLGVMYA ERXRVRQD...
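
The correspondence between a nucleotide sequence and its amino acid sequence can be checked with a short translation routine. The Python sketch below is an illustration only, not the software used for the analysis: it applies the standard genetic code, reports a stop codon as "*", and reports any codon containing an undetermined base (such as "." or "N") as "X", which appears to be the convention followed in the listings here. All names in the code are hypothetical.

```python
# Standard genetic code, with codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO_ACIDS = ("FFLLSSSSYY**CC*W" "LLLLPPPPHHQQRRRR"
               "IIIMTTTTNNKKSSRR" "VVVVAAAADDEEGGGG")
CODON_TABLE = {
    a + b + c: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def translate(dna: str) -> str:
    """Translate a DNA string in reading frame 1.

    Whitespace is ignored, so 10-base blocks can be pasted as printed.
    Codons with undetermined bases become 'X'; stop codons become '*'.
    A trailing incomplete codon is dropped.
    """
    seq = "".join(dna.split()).upper()
    usable = len(seq) - len(seq) % 3
    return "".join(CODON_TABLE.get(seq[i:i + 3], "X")
                   for i in range(0, usable, 3))


if __name__ == "__main__":
    # First 30 bases of the partial sequence (SEQ ID NO: 1).
    print(translate("ATGAAACAGA CAGTCAA.AT GCTTGCCGCC"))
```

Run on the first 30 bases of the partial sequence, the routine gives the first ten residues of the translation listed above, including the X at position 6 that arises from the undetermined base.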

Further work revealed the complete nucleotide sequence [<SEQ ID 3>] (SEQ ID NO: 3):



1 ATGAAACAGA CAGTCAAATG GCTTGCCGCC GCCCTGATTG CCTTGGGCTT 

51 GAACCGAGCG GTGTGGGCGG ATGACGTATC GGATTTTCGG GAAAACTTGC 

101 AGGCGGCAGC ACAGGGAAAT GCAGCAGCCC AATACAATTT GGGCGCAATG 

151 TATTACAAAG GACGCGGCGT GCGCCGGGAT GATGCTGAAG CGGTCAGATG 

201 GTATCGGCAG GCGGCGGAAC AGGGGTTAGC CCAAGCCCAA TACAATTTGG 

251 GCTGGATGTA TGCCAACGGG CGCGGCGTGC GCCAAGATGA TACCGAAGCG 

301 GTCAGATGGT ATCGGCAGGC GGCAGCGCAG GGGGTTGTCC AAGCCCAATA 

351 CAATTTGGGC GTGATATATG CCGAAGGACG TGGAGTGCGC CAAGACGATG 

401 TCGAAGCGGT CAGATGGTTT CGGCAGGCGG CAGCGCAGGG GGTAGCCCAA

451 GCCCAAAACA ATTTGGGCGT GATGTATGCC GAAAGACGCG GCGTGCGCCA

501 AGACCGCGCC CTTGCACAAG AATGGTTTGG CAAGGCTTGT CAAAACGGAG 

551 ACCAAGACGG CTGCGACAAT GACCAACGCC TGAAGGCGGG TTATTGA 

This corresponds to the amino acid sequence [<SEQ ID 4; ORF37-1>] (SEQ ID NO: 4; ORF37-1):

1 MKQTVKWLAA ALIALGLNRA VWADDVSDFR ENLQAAAQGN AAAQYNLGAM

51 YYKGRGVRRD DAEAVRWYRQ AAEQGLAQAQ YNLGWMYANG RGVRQDDTEA

101 VRWYRQAAAQ GVVQAQYNLG VIYAEGRGVR QDDVEAVRWF RQAAAQGVAQ

151 AQNNLGVMYA ERRGVRQDRA LAQEWFGKAC QNGDQDGCDN DQRLKAGY*

Further work identified the corresponding gene in strain A of N. meningitidis [<SEQ ID 5>] (SEQ
ID NO: 5):

1 ATGAAACAGA CAGTCAAATG GCTTGCCGCC GCCCTGATTG CCTTGGGCTT 

51 GAACCAAGCG GTGTGGGCGG ATGACGTATC GGATTTTCGG GAAAACTTGC 

101 AGGCGGCAGC ACAGGGAAAT GCAGCAGCCC AAAACAATTT GGGCGTGATG

151 TATGCCGAAA GACGCGGCGT GCGCCAAGAC CGCGCCCTTG CACAAGAATG 

201 GCTTGGCAAG GCTTGTCAAA ACGGATACCA AGACAGCTGC GACAATGACC 

251 AACGCCTGAA AGCGGGTTAT TGA

This encodes a protein having amino acid sequence [<SEQ ID 6; ORF37a>] (SEQ ID NO: 6;
ORF37a):

1 MKQTVKWLAA ALIALGLNQA VWADDVSDFR ENLQAAAQGN AAAQNNLGVM
51 YAERRGVRQD RALAQEWLGK ACQNGYQDSC DNDQRLKAGY *

The originally-identified partial strain B sequence (ORF37) (SEQ ID NO: 2) shows 68.0% identity
over a 75aa overlap with ORF37a (SEQ ID NO: 6):

10 20 30 40 50 60 

or f 3 7 . pep MKQTVXMLAAALIALGLNRPVWX DDVSDFRENLXAAAQGNAAAQYNLGAMYXQRTRVRRD 

Mill lllllllllll: II llllllllll llllllllll llhll H Ihl 
orf3 7a MKQTVKWLAAALIALGLNQAVWAD DVSDFRENLQAAAQGNAAAQNNLGVMYAERRGVRQD 

20 10 20 30 40 50 60 

70 80 90 100 110 120 

or f 3 7 . pep DAEAVRWYRQPAEQGLAQAQYNLGWMYANGRXVRQDDTEAVRWYRQAAAQGWQAQ YNLG 

11*1 • ::| 
or f 3 7 a RALAQEWLGKACQNG YQDS CDNDQRLKAG YX 

25 70 80 90 
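
Identity figures of the kind quoted for these alignments can be illustrated with a small Python sketch. It assumes two already-aligned strings of equal length in which "-" marks a gap, and it takes the overlap to be the columns where neither sequence has a gap. That definition of overlap, and all names in the code, are assumptions; the alignment program used for the figures in the text may count differently.

```python
def percent_identity(aligned_a: str, aligned_b: str, gap: str = "-") -> float:
    """Percent of identical residues over the gap-free overlap of two
    pre-aligned sequences of equal length."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    # Keep only the columns in which both sequences carry a residue.
    overlap = [(a, b) for a, b in zip(aligned_a, aligned_b)
               if a != gap and b != gap]
    if not overlap:
        raise ValueError("no overlapping residues")
    matches = sum(1 for a, b in overlap if a == b)
    return 100.0 * matches / len(overlap)


if __name__ == "__main__":
    # First 20 residues of ORF37 (partial) and ORF37a, which align without gaps.
    orf37 = "MKQTVXMLAAALIALGLNRP"
    orf37a = "MKQTVKWLAAALIALGLNQA"
    print(percent_identity(orf37, orf37a))
```

Note that an undetermined residue (X) is counted here as a mismatch against any residue, which lowers the computed identity for the partial sequence.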

Further work identified the corresponding gene in N. gonorrhoeae [<SEQ ID 7>] (SEQ ID NO: 7):



1 ATGAAACAGA CAGTCAAATG GCTTGCCGCC GCCCTGATTG CCTTGGGCTT 

51 GAACCAAGCG GTGTGGGCGG GTGACGTATC GGATTTTCGG GAAAACTTGC 

101 AGgcggcaGA ACaggGAAAT GCAGCAGCCC AATTCAATTT GGGCGTGATG

151 TATGAAAATG GACAAGGAGT TCGTCAAGAT TATGTACAGG CAGTGCAGTG 

201 GTATCGCAAG GCTTCAGAAC AAGGGGATGC CCAAGCCCAA TACAATTTGG 

251 GCTTGATGTA TTACGATGGA CGCGGCGTGC GCCAAGACCT TGCGCTCGCT 

301 CAACAATGGC TTGGCAAGGC TTGTCAAAAC GGAGACCAAA ACAGCTGCGA 

351 CAATGACCAA CGCCTGAAGG CGGGTTATTA A

This encodes a protein having amino acid sequence [<SEQ ID 8; ORF37ng>] (SEQ ID NO: 8;
ORF37ng):

1 MKQTVKWLAA ALIALGLNQA VWAGDVSDFR ENLQAAEQGN AAAQFNLGVM
51 YENGQGVRQD YVQAVQWYRK ASEQGDAQAQ YNLGLMYYDG RGVRQDLALA
101 QQWLGKACQN GDQNSCDNDQ RLKAGY*

The originally-identified partial strain B sequence (ORF37) (SEQ ID NO: 2) shows 64.9% identity
over a 111aa overlap with ORF37ng (SEQ ID NO: 8):

orf 37 .pep MKQTVXMLAAALIALGLNRPWXDDVSDFRENI^CAAAQGNAAAQYNLGAMYXQRTRVRRD 60 

Mill MINIMI h II MINIM II 1 1 1 1 1 1 1 : 1 1 1 = 1 1 : INI 

orf 3 7ng MKQTVKWLAAALI ALGLNQAVWAGDVSDFRENLQAAEQGNAAAQFNLGVMYENGQGVRQD 6 0 

or f 3 7 . pep DAEAVRWYRQPAEQGLAQAQYNLGWMYANGRXVRQDDTEAVRWYRQAAAQGWQAQYNLG 12 0 

- NUN Nil I I I I ! 1 I I || N | | I I : I N N I 
orf37ng YVQAVQWYRKASEQGDAQAQYNLGLMYYDGRGVRQDLALAQQWLGKACQNGDQNSCDNDQ 120 

orf 37 .pep V I YAEGRGVRQDDVEAVRWFRQAAAQGVAQAQNNLGVM YAERXRVRQD 168 

orf37ng RLKAGY 126 

The complete strain B sequence (ORF37-1) (SEQ ID NO: 4) and ORF37ng (SEQ ID NO: 8) show
51.5% identity in 198 aa overlap:

15 10 20 30 40 50 60 

or f 3 7 - 1 pep MKQTVKWLAAALI ALGLNRAVWADDVSDFRENLQAAAQGNAAAQYNLGAMYYKGRGVRRD 

NN INI II INI II INN III I II II II II I II I II IN INN = hi I hi 

orf 3 7ng MKQTVKWLAAALIALGLNQAVWAGDVSDFRENLQAAEQGNAAAQFNLGVMYENGQGVRQD 

10 20 30 40 50 60 

20 70 80 90 100 110 120 

or f 3 7 - 1 . pep DAEAVRWYRQAAEQGLAQAQYNLGWMYANGRGVRQDDTEAVRWYRQAAAQGWQAQYNLG 

::|hllhhlll II II NININ 

orf 3 7ng YVQAVQWYRKASEQGDAQAQYNLGLMYYDGRGVRQD 

70 80 90 

25 130 140 150 160 170 180 

or f 3 7 - 1 . pep VI YAEGRGVRQDDVEAVRWFRQAAAQGVAQAQNNLGVMYAERRGVRQDRALAQEWFGKAC 

Mlhhllll 

or f 3 7ng LALAQQWLGKAC 

100 

30 190 199 

or f 3 7 - 1 . pep QNGDQDGCDNDQRLKAGYX 



35 



I I I I I : : I I I I t III 
orf 3 7ng QNGDQNSCDNDQRLKAGYX 
110 120 



Computer analysis of these amino acid sequences indicates a putative leader sequence, and it was 
predicted that the proteins from N. meningitidis and N. gonorrhoeae, and their epitopes, could be 
useful antigens for vaccines or diagnostics, or for raising antibodies. 

ORF37-1 (SEQ ID NO: 4) (11kDa) was cloned in pET and pGex vectors and expressed in E. coli,
as described above. The products of protein expression and purification were analyzed by
SDS-PAGE. Figure 1A shows the results of affinity purification of the GST-fusion protein, and Figure
1B shows the results of expression of the His-fusion in E. coli. Purified GST-fusion protein was
used to immunise mice, whose sera were used for ELISA (positive result), FACS analysis (Figure
1C), and a bactericidal assay (Figure 1D). These experiments confirm that ORF37-1 (SEQ ID NO:
4) is a surface-exposed protein, and that it is a useful immunogen.

Figure 1E shows plots of hydrophilicity, antigenic index, and AMPHI regions for ORF37-1 (SEQ
ID NO: 4).

Example 2 

The following partial DNA sequence was identified in N. meningitidis [<SEQ ID 9>] (SEQ ID NO:
9):

TTCGGCGA CATCGGCGGT TTGAAGGTCA ATGCCCCCGT CAAATCCGCA

GGCGTATTGG TCGGGCGCGT CGGCGCTATC GGACTTGACC CGAAATCCTA

TCAGGCGAGG GTGCGCCTCG ATTTGGACGG CAAGTATCAG TTCAGCAGCG

ACGTTTCCGC GCAAATCCTG ACTTCsGGAC TTTTGGGCGA GCAGTACATC

GGGCTGCAGC AGGGCGGCGA CACGGAAAAC CTTGCTGCCG GCGACACCAT

CTCCGTAACC AGTTCTGCAA TGGTTCTGGA AAACCTTATC GGCAAATTCA

TGACGAGTTT TGCCGAGAAA AATGCCGACG GCGGCAATGC GGAAAAAGCC

GCCGAATAA

This corresponds to the amino acid sequence [<SEQ ID 10>] (SEQ ID NO: 10):

1 FGDIGGLKVN APVKSAGVLV GRVGAIGLDP KSYQARVRLD LDGKYQFSSD

51 VSAQILTSGL LGEQYIGLQQ GGDTENLAAG DTISVTSSAM VLENLIGKFM 
101 TSFAEKNADG GNAEKAAE* 

Computer analysis of this amino acid sequence gave the following results: 

Homology with a hypothetical H. influenzae protein (ybrd.haein; accession number p45029) (SEQ
ID NO: 1105)

SEQ ID NO: 9 and ybrd.haein (SEQ ID NO: 1105) show 48.4% aa identity in 122 aa overlap:



20 30 40 50 60 70 

yrbd . h LGIGALVFLGLRVANVQGFAETKSYTVTATFDNIGGLKVRAPLKIGGWIGRVSAITLDE 

30 I : : I I I I I I : I I : I : I h : I II : I I = I I 

N.m FGDIGGLKVNAPVKSAGVLVGRVGAIGLDP 

10 20 30 

80 90 100 110 120 130

yrbd.h KSYLPKVSIAINQEYNEIPENSSLSIKTSGLLGEQYIALTMGFDDGDTAMLKNGSQIQDT 

Ml ::|::::: :| - I I Mlllllllhl I Mh I :|= I I 

N . m KS YQARVRLDLDGKY - QFSSDVSAQ I LTS GLLGEQY I GLQQG GDTENLAAGDT I S VT 

5 40 50 60 70 80 

140 150 160 

yrbd . h TS AMVLEDL I GQ FL - - YGSKKSDGNEKSESTEQ 
:||||||:|||:|: :::|::||:: ::::|: 
N . m S S AMVLENL I GKFMTS FAEKNADGGNAEKAAEX 
10 90 100 110 120 

Homology with a predicted ORF from N. gonorrhoeae

SEQ ID NO: 9 shows 99.2% identity over a 118aa overlap with a predicted ORF from N.
gonorrhoeae (SEQ ID NO: 1106; yrbd):

20 30 40 50 60 70 

1 5 yrbd GAAAVAFLAFRVAGGAAFGGSDKT YAVYAD FGD I GGL KVNAP VKS AGVLVGRVGA I GLDP 

MMIMIIIIIIIIMI I IIIMMI! 

N . m FGD I GGLKVNAPVKS AGVLVGRVGAI GLDP 

10 20 30 

80 90 100 110 120 130 

20 yrbd KS YQARVRLDLDGKYQ FS SDVS AQ I LTS GLLGEQY I GLQQGGDTENLAAGDT ISVTS SAM 

MMM IMMIMI MM IMM I MM II MUM 1 1 Mill MIMI II II I II 

N . m KSYQARVRLDLDGKYQFSSDVSAQ I LTS GLLGEQY I GLQQGGDTENLAAGDT ISVTS SAM 

40 50 60 70 80 90 

140 150 160 

25 yrbd VLENL I GKFMTS FAEKNAEGGNAEKAAEX 

M MMMMMMMMMMIMM 

N . m VLENL I GKFMTS FAEKNADGGNAEKAAEX 

100 110 120 

The complete yrbd H. influenzae sequence has a leader sequence and it is expected that the
full-length homologous N. meningitidis protein will also have one. This suggests that it is either a
membrane protein, a secreted protein, or a surface protein and that the protein, or one of its
epitopes, could be a useful antigen for vaccines or diagnostics.

Example 3

The following partial DNA sequence was identified in N. meningitidis [<SEQ ID 11>] (SEQ ID
NO: 11):

1 ..ATTTTGATAT ACCTCATCCG CAAGAATCTA GGTTCGCCCG TCTTCTTCTT

51 TCAGGAACGC CCCGGAAAGG ACGGAAAACC TTTTAAAATG GTCAAATTCC

101 GTTCCATGCG CGACGGCTTG TATTCAGACG GCATTCCGCT GCCCGACGGA

151 GAACGCCTGA CACCGTTCGG CAAAAAACTG CGTGCCGcCA GTwTGGACGA

201 ACTGCCTGAA TTATGGAATA TCTTAAAAGG CGAGATGAGC CTGGTCGGCC

251 CCCGCCCGCT GCTGATGCAA TATCTGCCGC TGTACGACAA CTTCCAAAAC

301 CGCCGCCACG AAATGAAACC CGGCATTACC GGCTGGGCGC AGGTCAACGG

351 GCGCAACGCg CTTTCGTGGG ACGAAAAATT CGCCTGCGAT GTTTGGTATA

401 TCGACCACTT CAGCCTGTGC CTCGACATCA AAATCCTACT GCTGACGGTT

451 AAAAAAGTAT TAATCAAGGA AGGGATTTCC GCACAGGGCG AACA.aCCAT

501 GCCCCCTTTC ACAGGAAAAC GCAAACTCGC CGTCGTCGGT GCGGGCGGAC

551 ACGGAAAAGT CGTTGCCGAC CTTGCCGCCG CACTCGGCCG GTACAGGGAA

601 ATCGTTTTTC TGGACGACCG CGCACAAGGC AGCGTCAACG GCTTTTCCGT

651 CATCGGCACG ACGCTGCTGC TTGAAAACAG TTTATCGCCC GAACAATACG

701 ACGTCGCCGT CGCCGTCGGC AACAACCGCA TCCGCCGCCA AATCGCCGAA

751 AAAGCCGCCG CGCTCGGCTT CGCCCTGCCC GTACTGGTTC ATCCGGACGC

801 GACCGTCTCG CCTTCTGCAA CAGTCGGACA AGGCAGCGTC GTTATGGCGA

851 AAGCGGTCG.




This corresponds to the amino acid sequence [<SEQ ID 12; ORF3>] (SEQ ID NO: 12; ORF3):

1 ..ILIYLIRKNL GSPVFFFQER PGKDGKPFKM VKFRSMRDGL YSDGIPLPDG

51 ERLTPFGKKL RAASXDELPE LWNILKGEMS LVGPRPLLMQ YLPLYDNFQN

101 RRHEMKPGIT GWAQVNGRNA LSWDEKFACD VWYIDHFSLC LDIKILLLTV

151 KKVLIKEGIS AQGEXTMPPF TGKRKLAVVG AGGHGKVVAD LAAALGRYRE

201 IVFLDDRAQG SVNGFSVIGT TLLLENSLSP EQYDVAVAVG NNRIRRQIAE

251 KAAALGFALP VLVHPDATVS PSATVGQGSV VMAKAV..

Further sequence analysis revealed the complete nucleotide sequence [<SEQ ID 13>] (SEQ ID
NO: 13):



1 ATGAGTAAAT TCTTCAAACG CCTGTTTGAC ATTGTTGCCT CCGCCTCGGG 

51 ACTGATTTTC CTCTCGCCAG TATTTTTGAT TTTGATATAC CTCATCCGCA 

101 AGAATCTAGG TTCGCCCGTC TTCTTCTTTC AGGAACGCCC CGGAAAGGAC 

151 GGAAAACCTT TTAAAATGGT CAAATTCCGT TCCATGCGCG ACGCGCTTGA 

201 TTCAGACGGC ATTCCGCTGC CCGACGGAGA ACGCCTGACA CCGTTCGGCA 

251 AAAAACTGCG TGCCGCCAGT TTGGACGAAC TGCCTGAATT ATGGAATATC 

301 TTAAAAGGCG AGATGAGCCT GGTCGGCCCC CGCCCGCTGC TGATGCAATA 

351 TCTGCCGCTG TACGACAACT TCCAAAACCG CCGCCACGAA ATGAAACCCG
401 GCATTACCGG CTGGGCGCAG GTCAACGGGC GCAACGCGCT TTCGTGGGAC

451 GAAAAATTCG CCTGCGATGT TTGGTATATC GACCACTTCA GCCTGTGCCT
501 CGACATCAAA ATCCTACTGC TGACGGTTAA AAAAGTATTA ATCAAGGAAG 
551 GGATTTCCGC ACAGGGCGAA GCCACCATGC CCCCTTTCAC AGGAAAACGC 
601 AAACTCGCCG TCGTCGGTGC GGGCGGACAC GGAAAAGTCG TTGCCGACCT 
651 TGCCGCCGCA CTCGGCCGGT ACAGGGAAAT CGTTTTTCTG GACGACCGCG 
701 CACAAGGCAG CGTCAACGGC TTTTCCGTCA TCGGCACGAC GCTGCTGCTT 
751 GAAAACAGTT TATCGCCCGA ACAATACGAC GTCGCCGTCG CCGTCGGCAA 
801 CAACCGCATC CGCCGCCAAA TCGCCGAAAA AGCCGCCGCG CTCGGCTTCG 
851 CCCTGCCCGT TCTGGTTCAT CCGGACGCGA CCGTCTCGCC TTCTGCAACA 
901 GTCGGACAAG GCAGCGTCGT TATGGCGAAA GCCGTCGTAC AGGCAGGCAG 
951 CGTATTGAAA GACGGCGTGA TTGTGAACAC TGCCGCCACC GTCGATCACG 

1001 ACTGCCTGCT TAACGCTTTC GTCCACATCA GCCCAGGCGC GCACCTGTCG 

1051 GGCAACACGC ATATCGGCGA AGAAAGCTGG ATAGGCACGG GCGCGTGCAG 

1101 CCGCCAGCAG ATCCGTATCG GCAGCCGCGC AACCATTGGA GCGGGCGCAG 

1151 TCGTCGTACG CGACGTTTCA GACGGCATGA CCGTCGCGGG CAATCCGGCA 

1201 AAGCCGCTGC CGCGCAAAAA CCCCGAGACC TCGACAGCAT AA

This corresponds to the amino acid sequence [<SEQ ID 14; ORF3-1>] (SEQ ID NO: 14; ORF3-1):

1 MSKFFKRLFD IVASASGLIF LSPVFLILIY LIRKNLGSPV FFFQERPGKD
51 GKPFKMVKFR SMRDALDSDG IPLPDGERLT PFGKKLRAAS LDELPELWNI
101 LKGEMSLVGP RPLLMQYLPL YDNFQNRRHE MKPGITGWAQ VNGRNALSWD
151 EKFACDVWYI DHFSLCLDIK ILLLTVKKVL IKEGISAQGE ATMPPFTGKR
201 KLAVVGAGGH GKVVADLAAA LGRYREIVFL DDRAQGSVNG FSVIGTTLLL
251 ENSLSPEQYD VAVAVGNNRI RRQIAEKAAA LGFALPVLVH PDATVSPSAT

301 VGQGSVVMAK AVVQAGSVLK DGVIVNTAAT VDHDCLLNAF VHISPGAHLS
351 GNTHIGEESW IGTGACSRQQ IRIGSRATIG AGAVVVRDVS DGMTVAGNPA
401 KPLPRKNPET STA*

Computer analysis of this amino acid sequence gave the following results:

Homology with a predicted ORF from N. meningitidis (strain A)

ORF3 (SEQ ID NO: 12) shows 93.0% identity over a 286aa overlap with an ORF (ORF3a) (SEQ
ID NO: 16) from strain A of N. meningitidis:

10 20 30 

]5 orf3 pep ILIYLI RKNLGSPVFFFQERPGKDGKPFKMVKFR 

IIMIIIIIMIMIIIIIMIIIIIIIIIIIII 

orf3a MSKFFKRLFDIVAS ASGLIFLSPVFLILIYLI RKNLGSPVFFFQERPGKDGKPFKMVKFR 

10 20 30 . 40 50 60 

40 50 60 70 80 90 

20 orf 3 . pep SMRDGLYSDGIPLPDGERLTPFGKKLRAASXDELPELWNILKGEMSLVGPRPLLMQYLPL 

Ihhl I I I I I I I I I I I I M I I I I I I I I I 1 I I I I : I 1 I : I I I I I t 1 I I I 

orf 3a SMHDALDSDGILLPDGERLTPFGKKLRAASLDELPELWNVLKGDMSLVGPRPLLMQYLPL 

70 80 90 100 110 120 

100 110 120 130 140 150 

25 orf 3 .pep YDNFONRRHEMKPG I TGWAOVNGRNALSWDEKFACDVWYIDHFS LCLD1KI LLLTVKKVL 

. 1 1 1 1 1 1 II 1 1 1 1 1 1 1 1 1 Mi 1 1 1 1 1 i ,1 1 h 1 1 1 h 1 1 1 1 1 1 i 1 1 1 1 1 1 1 1 M M II I 

or f 3 a YDNFQNRRHEMKPG I TGWAQVNGRNALS WDERFACD I W Y I DHFS LCLD I KI LLLTVKKVL 

130 140 150 160 170 180 

160 170 180 190 200 210 

30 orf 3 .pep IKEGISAQGEXTMPPFTGKRKIjAWGAGGHGKWADLAAALGRYREIVFLDDRAQGSVNG 

TlllllMII lllllllllllllllllllllllhllllll I lllllllhllllll 

or f 3 a I KEGI SAQGEATMPP FTGKRKLAWGAGGHGKWAELAAALGTYGE I VFLDDRVQGSVNG 

190 200 210 220 230 240 

220 230 240 250 260 270 

35 orf 3 . pep FSVIGTTLLLENSLSPEQYDVAVAVGNNRIRRQIAEKAAALGFALPVLVHPDATVSPSAT 

I I I I I I I I I M I I I I M :| : I I I I I I I I I I I I I I I I I I I I I M I i I h I I h I I I I I I 
orf 3a FPVIGTTLLLENSLSPEQFDIAVAVGNNRIRRQIAEKAAALGFALPVLIHPDSTVSPSAT 

250 260 270 280 290 300 

280 

40 orf 3 .pep VGQGSWMAKAV 

Nihil MM 

orf 3a VGQGGWMAKAWQADSVLKDGVIVNTAATVDHDCLLDAFVHISPGAHLSGNTRIGEESW 

310 320 330 340 350 360 

The complete length ORF3a nucleotide sequence [<SEQ ID 15>] (SEQ ID NO: 15) is:

1 ATGAGTAAAT TCTTCAAACG CCTGTTTGAC ATTGTTGCCT CCGCCTCGGG

51 ACTGATTTTC CTCTCGCCAG TATTTTTGAT TTTGATATAC CTCATCCGCA

101 AGAATCTGGG TTCGCCCGTC TTCTTCTTTC AGGAACGCCC CGGAAAGGAC 

151 GGAAAACCTT TTAAAATGGT CAAATTCCGT TCCATGCACG ACGCGCTTGA 

201 TTCAGACGGC ATTCTGCTGC CCGACGGAGA ACGCCTGACA CCGTTCGGCA 

251 AAAAACTGCG TGCCGCCAGT TTGGACGAAC TGCCCGAACT GTGGAACGTC 

301 CTCAAAGGCG ACATGAGCCT GGTCGGCCCC CGCCCGCTGC TGATGCAATA

351 TCTGCCGCTG TACGACAACT TCCAAAACCG CCGCCACGAA ATGAAACCGG 

401 GCATTACCGG CTGGGCGCAG GTCAACGGGC GCAACGCGCT TTCGTGGGAC 

451 GAACGCTTCG CATGCGACAT CTGGTATATC GACCACTTCA GCCTGTGCCT 

501 CGACATCAAA ATCCTACTGC TGACGGTTAA AAAAGTATTA ATCAAAGAAG 

551 GGATTTCCGC ACAGGGCGAA GCCACCATGC CCCCTTTCAC AGGAAAACGC 

601 AAACTTGCCG TCGTCGGTGC GGGCGGACAC GGCAAAGTCG TTGCCGAGCT 

651 TGCCGCCGCA CTCGGCACAT ACGGCGAAAT CGTTTTTCTG GACGACCGCG 

701 TCCAAGGCAG CGTCAACGGC TTCCCCGTCA TCGGCACGAC GCTGCTGCTT 

751 GAAAACAGTT TATCGCCCGA ACAATTCGAC ATCGCCGTCG CCGTCGGCAA 

801 CAACCGCATC CGCCGCCAAA TCGCCGAAAA AGCCGCCGCG CTCGGCTTCG 

851 CCCTGCCCGT CCTGATTCAT CCGGACTCGA CCGTCTCGCC TTCTGCAACA 

901 GTCGGACAAG GCGGCGTCGT TATGGCGAAA GCCGTCGTAC AGGCTGACAG 

951 CGTATTGAAA GACGGCGTAA TTGTGAACAC TGCCGCCACC GTCGATCACG 

1001 ATTGCCTGCT TGATGCTTTC GTCCACATCA GCCCGGGCGC GCACCTGTCG 

1051 GGCAACACGC GTATCGGCGA AGAAAGCTGG ATAGGCACAG GCGCGTGCAG 

1101 CCGCCAGCAG ATCCGTATCG GCAGCCGCGC AACCATTGGA GCGGGCGCAG 

1151 TCGTCGTGCG CGACGTTTCA GACGGCATGA CCGTCGCGGG CAACCCGGCA 

1201 AAACCATTGG CAGGCAAAAA TACCGAGACC CTGCGGTCGT AA 

This is predicted to encode a protein having amino acid sequence [<SEQ ID 16>] (SEQ ID NO:
16):

1 MSKFFKRLFD IVASASGLIF LSPVFLILIY LIRKNLGSPV FFFQERPGKD

51 GKPFKMVKFR SMHDALDSDG ILLPDGERLT PFGKKLRAAS LDELPELWNV

101 LKGDMSLVGP RPLLMQYLPL YDNFQNRRHE MKPGITGWAQ VNGRNALSWD

151 ERFACDIWYI DHFSLCLDIK ILLLTVKKVL IKEGISAQGE ATMPPFTGKR

201 KLAVVGAGGH GKVVAELAAA LGTYGEIVFL DDRVQGSVNG FPVIGTTLLL

251 ENSLSPEQFD IAVAVGNNRI RRQIAEKAAA LGFALPVLIH PDSTVSPSAT

301 VGQGGVVMAK AVVQADSVLK DGVIVNTAAT VDHDCLLDAF VHISPGAHLS
351 GNTRIGEESW IGTGACSRQQ IRIGSRATIG AGAVVVRDVS DGMTVAGNPA

401 KPLAGKNTET LRS*

Two transmembrane domains are underlined.

ORF3-1 (SEQ ID NO: 14) shows 94.6% identity in 410 aa overlap with ORF3a (SEQ ID NO: 16):



10 20 30 40 50 60 

MSKFFKRLFD I VASASGL I FLSPVFLI LI YLIRKNLGSPVFFFQERPGKDGKPFKMVKFR 

M 1 1 M M 1 1 1 1 1 1 1 1 M 1 1 M I II , 1 1 1 1 1 1 1 1 II Ml i I 1 1 1 1 M 1 1 1 1 1 1 1 1 M 

MSKFFKRLFDIVASASGLIFLSPVFLILIYLIRKNLGSPVFFFQERPGKDGKPFKMVKFR 
10 20 30 40 50 60 

70 80 90 100 110 120 

SMHDALDSDG I LLPDGERLTPFGKKLRAASLDELPELWNVLKGDMSLVGPRPLLMQYLPL 

Ihllllllll 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 M i 1 1 M I M 1 1 M 1 1 1 1 1 M 1 1 1 1 1 M 

SMRDALDSDGIPLPDGERLTPFGKKLRAASLDELPELWNILKGEMSLVGPRPLLMQYLPL 
70 80 90 100 110 120 

130 140 150 160 170 180 

YDN FQNRRHEMKPG I TGWAQVNGRNALS WDERFACD I W Y I DH FS LCLD I KI LLLTVKKVL 



orf 3a .pep 
orf3-l 

orf 3a .pep 
orf 3-1 

orf 3a .pep 
M ■ I 1 1 i ] 1 1 1 II 1 1 1 1 1 1 1 1 1 1 M II 1 1 h I II M 1 1 1 1 1 1 1 1 1 1 1 1 1 1 II 1 1 1 1 1

orf 3 - 1 YDNFQNRRHEMKPGITGWAQVNGRNALSWDEKFACDVWYIDHFSLCLDIKILLLTVKKVL 

130 140 150 160 170 180 

190 200 210 220 230 240 

5 orf 3a . pep IK^GISAQGEATMPPFTGKRKLAWGAGGHGKWAEIAAALGTYGEIVFLDDRVQGSVNG 

1 1 1 M II 1 1 1 II II I! 1 1 1 1 1 1 1 1 1 M > 1 1 1 1 h 1 1 M 1 1 I IIIIIMhIIIIII 

orf 3 - 1 IKEGISAQGEATMPPFTGKRKLAWGAGGHGKWADLAAALGRYREIVFLDDRAQGSVNG 

190 200 210 220 230 240 

250 260 270 280 290 300 

1 0 orf 3a . pep FPVIGTTLLLENSLSPEQFDIAVAVGNNRIRRQIAEKAAALGFALPVLIHPDSTVSPSAT 

I I I I I I I I I I I II I I : hi I I I I I I I M I I I I I II I I I I I I I h II h II I I I I 
orf 3 - 1 FSVIGTTLLLENSLSPEQYDVAVAVGNNRIRRQIAEKAAALGFALPVLVHPDATVSPSAT 

250 260 270 280 290 300 

310 320 330 340 350 360 

1 5 orf 3a . pep VGQGGVVMAKAVVQADS VLKDGVI VNTAATVDHDCLLDAFVH I S PGAHLSGNTR I GEES W 

I ; 1 : 1 1 1 1 1 1 1 1 1 MINIUM lillllllMIIIIMIIU Ihllllll 

orf 3 - 1 VGQGSVVMAKAWQAGS VLKDGVI WTAATVDHDCLLNAFVH I S PGAHLSGNTH I GEESW 

310 320 330 340 350 360 

370 380 390 400 410 

20 orf 3a .pep I GTGACSRQQ I R I GSRAT I GAGAVWRDVSDGMTVAGNPAKPLAGKNTETLRSX 

M llllllllllllllllllll IMIIIIMIIIIIIIII II II 

orf 3 - 1 I GTGACSRQQ I R I GSRAT I GAGAVWRDVSDGMTVAGNPAKPLPRKNPETSTAX 

370 380 390 400 410 



Homology with hypothetical protein encoded by yvfc gene (accession Z71928) (SEQ ID NO:
1108) of B. subtilis

ORF3 (SEQ ID NO: 12) and YVFC proteins (SEQ ID NO: 1108) show 55% aa identity in 170 aa
overlap (BLASTp):

ORF3 3 IYLIRKNLGSPVFFFQERPGKDGKPFKMVKFRSMRDGLYSDGIPLPDGERLTPFGKKLRA 62
I ++R +GSPVFF Q RPG GKPF + KFR+M D* S G LPD RLT G+ +R
yvfc 27 IAWRLKIGSPVFFKQVRPGLHGKPFTLYKFRTMTDERDSKGNLLPDEVRLTKTGRLIRK 86

ORF3 63 ASXDELPELWNILKGEMSLVGPRPLLMQYLPLYDNFQNRRHEMKPGITGWAQVNGRNALS 122

S DELP+L N+LKG++SLVGPRPLLM YLPLY Q RRHE + KPG I TGWAQ +NGRNA + S
yvfc 87 LSIDELPQLLNVLKGDLSLVGPRPLLMDYLPLYTEKQARRHEVKPGITGWAQINGRNAIS 146

ORF3 123 WDEKFACDVWYIDHFSLCLDXXXXXXXXXXXXXXEGISAQGEXTMPPFTG 172
W++KF DVWY+D++S LD EGI T FTG

yvfc 147 WEKKFELDVWYVDNWSFFLDLKILCLTVRKVLVSEGIQQTNHVTAERFTG 196



Homology with a predicted ORF from N. gonorrhoeae

ORF3 (SEQ ID NO: 12) shows 86.3% identity over a 286aa overlap with a predicted ORF
(ORF3.ng) (SEQ ID NO: 18) from N. gonorrhoeae:
orf3 ILIYLIRKNLGSPVFFFQERPGKDGKPFKMVKFR 34

:|||lllll lllll|::|lllllllllllllll 

orf3ng MSKAVKRLFDIIAS ASGLIVLSPVFLVLIYLI RKNKGSPVFFIRERPGKDGKPFKMVKFR 60 

orf3 SMRDGLYSDGIPLPDGERLTPFGKKLRAASXDELPELWNILKGEMSLVGPRPLLMQYLPL 94 

5 1 1 1 1 I lllllllhllll llllllh! : I 1 1 1 i hi I M 1 1 1 1 1 1 1 1 1 M M I 

orf 3ng SMRDALDSDGIPLPDSERLTDFGKKLRATSLDELPELWNVLKGEMSLVGPRPLLMQYLPL 120 

O r f 3 YDNFQNRRHEMKPG I TGWAQVNGRNALS WDE KFACDVW Y I DHFS LCLD I KI LLLTVKKVL 154 

|::||lllll IMIIIIIIIIIIIIMIII MUM Ml: I I = I I I : I I I I I I I 

O r f 3 ng YNKFQNRRHEMKPG I TGWAQVNGRNALS WDE KFS CDVW YTDNFS FWLDMKI L FLTVKKVL 180 

10 orf 3 IKEGISAQGEXTMPPFTGKKKIAWGAGGHGKWADLAAALGRYREIVFLDDRAQGSVNG 214 

MINIUM II 1 1 hi h I II hi I II II h hh 1 1 1 1 I lllllllhllllll 

orf 3ng IKEGISAQGEATMPPFAGNRKLAVIGAGGHGKWAELAAALGTYGEIVFLDDRTQGSVNG 24 0 

orf 3 FSVIGTTLLLENSLSPEQYDVAVAVGNNRIRRQIAEKAAALGFALPVLVHPDATVSPSAT 2 74 

I 1 1 1 1 M i II 1 1 1 1 1 1 hhh h 1 1 1 1 1 1 h hi h 1 1 1 1 1 lllhllllllllll 

15 orf3ng FPVIGTTLLLENSLSPEQFDITVAVGNNRIRRQITENAAALGFKLPVLIHPDATVSPSAI 300 

orf3 VGQGSWMAKAV 286 
hhh I lh 

orf 3ng IGQGSWMAKAWQAGSVLKDGVIVNTAATVDHDCLLDAFVHISPGAHLSGNTRIGEESR 360 

The complete length ORF3ng nucleotide sequence [<SEQ ID 17>] (SEQ ID NO: 17) is:

1 ATGAGTAAAG CCGTCAAACG CCTGTTCGAC ATCATCGCAT CCGCATCGGG 

51 GCTGATTGTC CTGTCGCCCG TGTTTTTGGT TTTAATATAC CTCATCCGCA 

101 AAAACTTAGG TTCGCCCGTC TTCTTCattC GGGAACGCCc cgGAAAGGAc 

151 ggaaaacCTT TTAAAATGGT CAAATTCCGT TCCAtgcgcg acgcgcttGA

201 TTCAGACGGC ATTCCGCTGC CCGATAGCGA ACGCCTGACC GATTTCGGCA

251 AAAAATTACG CGCCACCAGT TTGGACGAAC TTCCTGAATT ATGGAATGTC 

301 CTCAAAGGCG AGATGAGCCT GGTCGGCCCC CGCCCGCTTT TGATGCAGTA 

351 TCTGCCGCTT TACAACAAAT TTCAAAACCG CCGCCACGAA ATGAAACCGG 

401 GCATTACCGG CTGGGCGCAG GTCAACGGGC GCAACGCGCT TTCGTGGGAC 

451 GAAAAGTTCT CCTGCGATGT TTGGTACACC GACAATTTCA GCTTTTGGCT

501 GGATATGAAA ATCCTGTTTC TGACAGTCAA AAAAGTCTTG ATTAAAGAAG 

551 GCATTTCGGC GCAAGGGGAA GCCACCATGC CCCCTTTCGC GGGGAATCGC 

601 AAACTCGCCG TTATCGGCGC GGGCGGACAC GGCAAAGTCG TTGCCGAGCT 

651 TGCCGCCGCA CTCGGCACAT ACGGCGAAAT CGTTTTTCTG GACGACCGCA 

701 CCCAAGGCAG CGTCAACGGC TTCCCCGTCA TCGGCACGAC GCTGCTGCTT

751 GAAAACAGTT TATCGCCCGA ACAATTCGAC ATCACCGTCG CCGTCGGCAA 

801 CAACCGCATC CGCCGCCAAA TCACCGAAAA CGCCGCCGCG CTCGGCTTCA 

851 AACTGCCCGT TCTGATTCAT CCCGACGCGA CCGTCTCGCC TTCTGCAATA 

901 ATCGGACAAG GCAGCGTCGT AATGGCGAAA GCCGTCGTAC AGGCCGGCAG 

951 CGTATTGAAA GACGGCGTGA TTGTGAACAC TGCCGCCACC GTCGATCACG

1001 ACTGCCTGCT TGACGCTTTC GtccaCATCA GCCCGGGCGC GCACCTGTCG 

1051 GGCAACACGC GTATCGGCGA AGAAAGCCGG ATAGGCACGG GCGCGTGCAG 

1101 CCGCCAGCAG ACAACCGTCG GCAGCGGGGT TACCgccgGT GCAGGGgcGG 

1151 TTATCGTATG CGACATCCCG GACGGCATGA CCGTCGCGGG CAACCCGGCA 

1201 AAGCCCCTTA CGGGCAAAAA CCCCAAGACC GGGACGGCAT AA

This encodes a protein having amino acid sequence [<SEQ ID 18>] (SEQ ID NO: 18):

1 MSKAVKRLFD IIASASGLIV LSPVFLVLIY LIRKNLGSPV FFIRERPGKD

51 GKPFKMVKFR SMRDALDSDG IPLPDSERLT DFGKKLRATS LDELPELWNV

101 LKGEMSLVGP RPLLMQYLPL YNKFQNRRHE MKPGITGWAQ VNGRNALSWD

151 EKFSCDVWYT DNFSFWLDMK ILFLTVKKVL IKEGISAQGE ATMPPFAGNR

201 KLAVIGAGGH GKVVAELAAA LGTYGEIVFL DDRTQGSVNG FPVIGTTLLL

251 ENSLSPEQFD ITVAVGNNRI RRQITENAAA LGFKLPVLIH PDATVSPSAI

301 IGQGSVVMAK AVVQAGSVLK DGVIVNTAAT VDHDCLLDAF VHISPGAHLS

351 GNTRIGEESR IGTGACSRQQ TTVGSGVTAG AGAVIVCDIP DGMTVAGNPA

401 KPLTGKNPKT GTA*
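
As a rough illustration of how a nucleotide listing such as SEQ ID NO: 17 maps onto its encoded protein, the following sketch translates the first 50 bases of the ORF3ng listing with the standard genetic code. The function name `translate` and the compact codon-table construction are illustrative assumptions and are not part of the original disclosure; lowercase (uncertain) bases are simply upper-cased, and any codon that cannot be resolved is reported as `X`.

```python
from itertools import product

# Standard genetic code, with codons ordered T, C, A, G at each position.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}

def translate(dna: str) -> str:
    """Translate a DNA string in frame 1; unresolved codons become 'X'."""
    dna = "".join(dna.split()).upper()  # drop block spacing, ignore case
    protein = []
    for i in range(0, len(dna) - 2, 3):
        protein.append(CODON_TABLE.get(dna[i:i + 3], "X"))
    return "".join(protein)

# First 50 bases of the ORF3ng listing (SEQ ID NO: 17), as printed above.
orf3ng_start = "ATGAGTAAAG CCGTCAAACG CCTGTTCGAC ATCATCGCAT CCGCATCGGG"
print(translate(orf3ng_start))  # MSKAVKRLFDIIASAS
```

The sixteen residues produced agree with the start of the SEQ ID NO: 18 listing; the two trailing bases that do not complete a codon are ignored.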



This protein shows 86.9% identity in 413 aa overlap with ORF3-1 (SEQ ID NO: 14):



10 20 30 40 50 60 

orf 3 - 1 . pep MSKFFKRLFDIVASASGLIFLSPVFLILI YLIRKNLGSPVFFFQERPGKDGKPFKMVKFR 

III I I I I M M I I I I I ||||||:||||||IMIIIII|::|||||IIIIIIIIIII 
orf3ng MSKAVKRLFDI IASASGLIVLSPVFLVLIYLIRKNLGSPVFFIRERPGKDGKPFKMVKFR 

10 20 30 40 50 60 



70 80 90 100 110 120 

orf 3 - 1 . pep SMRDALDSDGIPLPDGERLTPFGKKLRAASLDELPELWNILKGEMSLVGPRPLLMQYLPL 
llllllllllll hhll I I I h hi I I I I II I I hi II I II I I I I I h I I I I I II 
orf3ng SMRDALDSDGIPLPDSERLTDFGKKLRATSLDELPELWNVLKGEMSLVGPRPLLMQYLPL 

70 80 90 100 110 120 



130 140 150 160 170 180 

orf 3 - 1 . pep YDNFQNRRHEMKPGI TGWAQVNGRNALS WDEKFACDVWY I DHFSLCLD I KI LLLTVKKVL 

-I II I I I II I I I I I I I I I I II II I i I I I I h I I I hlh I I : I I I = I I I I I I I 
orf 3ng YNKFQNRRHEMKPGITGWAQWGRNALSWDEKFSCDVWYTDNFSFWLDMKILFLTVKKVL 

130 140 150 160 170 180 



190 200 210 220 230 240 

orf 3 - 1 . pep IKEGISAQGEATMPPFTGKRKLAWGAGGHGKVVADLAAALGRYREIVFLDDRAQGSVNG 

lllllllllll MhMlllh IIIIIIIIMI II I Mill Mlllll 

orf3ng IKEGISAQGEATMPPFAGNRKLAVIGAGGHGKWAELAAALGTYGEIVFLDDRTQGSVNG 

190 200 210 220 230 240 



250 260 270 280 290 300 

orf 3 - 1 . pep FSVIGTTLLLENSLSPEQYDVAVAVGNNRIRRQIAEKAAALGFALPVLVHPDATVSPSAT 

I I I I I M I I I I I I I I h I -I : I I I M I I I I hhll I I II I I I : I I I I I I I I I 
orf3ng FPVIGTTLLLENSLSPEQFDITVAVGNNRIRRQITENAAALGFKLPVLIHPDATVSPSAI 

250 260 270 280 290 300 



310 320 330 340 350 360 

orf 3 - 1 . pep VGQGSWMAKAWQAGSVLKDGVIVNTAATVDHDCLLNAFVHISPGAHLSGNTHIGEESW 

35 : | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | || | | : | | || I | I I I I I I I I I : I I I I I 

orf 3ng IGQGSVVMAKAWQAGSVLKDGVIWTAATVDHDCLLDAFVHISPGAHLSGNTRIGEESR 

310 320 330 340 350 360 

370 380 390 400 410 

or f 3 - 1 . pep IGTGACSRQQ I RIGSRATIGAGAVWRDVSDGMTVAGNPAKPLPRKNPETSTAX 

40 I I II II II I I :|| =1 II II hi h INI II I MINI hh hi 

orf 3ng IGTGACSRQQTTVGSGVTAGAGAVIVCDIPDGMTVAGNPAKPLTGKNPKTGTAX 

370 380 390 400 410 

In addition, ORF3ng (SEQ ID NO: 18) shows significant homology with a hypothetical protein
(SEQ ID NO: 1110) from B. subtilis:



gnl|PID|e238668 (Z71928) hypothetical protein [Bacillus subtilis]
>gi|1945702|gnl|PID|e313004 (Z94043) hypothetical protein [Bacillus subtilis]
>gi|2635938|gnl|PID|e1186113 (Z99121) similar to capsular polysaccharide
biosynthesis [Bacillus subtilis] Length = 202

Score = 235 bits (594), Expect = 3e-61
Identities = 114/195 (58%), Positives = 142/195 (72%)

Query: 5   VKRLFDIIASASGLIVLSPVFLVLIYLIRKNLGSPVFFIRERPGKDGKPFKMVKFRSMRD 64
           +KRLFD+ A+ L S + L I ++R +GSPVFF + RPG GKPF + KFR+M D
Sbjct: 3   LKRLFDLTAAIF LLiCCTSVl ILr 1 lAVVKljKHjbFVr r ls.yVKFtjijrHjlN.Ff llii K.r KlMi U

Query: 65  ALDSDGIPLPDSERLTDFGKKLRATSLDELPELWNVLKGEMSLVGPRPLLMQYLPLYNKF 124
             DS G  LPD  RLT  G+ +R  S+DELP+L NVLKG++SLVGPRPLLM YLPLY +
Sbjct: 63  ERDSKGNLLPDEVRLTKTGRLIRKLSIDELPQLLNVLKGDLSLVGPRPLLMDYLPLYTEK 122

Query: 125 QNRRHEMKPGITGWAQVNGRNALSWDEKFSCDVWYTDNFSFWLDMKILFLTVKKVLIKEG 184
           Q RRHE+KPGITGWAQ+NGRNA+SW++KF  DVWY DN+SF+LD+KIL LTV+KVL+ EG
Sbjct: 123 QARRHEVKPGITGWAQINGRNAISWEKKFELDVWYVDNWSFFLDLKILCLTVRKVLVSEG 182

Query: 185 ISAQGEATMPPFAGN 199
           I      T   F G+
Sbjct: 183 IQQTNHVTAERFTGS 197





The hypothetical product of the yvfc gene shows similarity to EXOY of R. meliloti, an
exopolysaccharide production protein. Based on this and on the two predicted transmembrane
regions in the homologous N. gonorrhoeae sequence, it is predicted that these proteins, or their
epitopes, could be useful antigens for vaccines or diagnostics, or for raising antibodies.

Example 4 

The following partial DNA sequence was identified in N. meningitidis [<SEQ ID 19>] (SEQ ID
NO: 19):

1 ..AACCATATGG CGATTGTCAT CGACGAATAC GGCGGCACAT CCGGCTTGGT

51 CACCTTTGAA GACATCATCG AGCAAATCGT CGGCGAAATC GAAGACGAGT

101 TTGACGAAGA CGATAGCGCC GACAATATCC ATGCCGTTTC TTCAGACACG

151 TGGCGCATCC ATGCAGCTAC CGAAATCGAA GACATCAACA CCTTCTTCGG

201 CACGGAATAC AGCATCGAAG AAGCCGACAC CATT.GGCGG CCTGGTCATT

251 CAAGAGTTGG GACATCTGCC CGTGCGCGGC GAAAAAGTCC TTATCGGCGG

301 TTTGCAGTTC ACCGTCGCAC GCGCCGACAA CCGCCGCCTG CATACGCTGA

351 TGGCGACCCG CGTGAAGTAA GC ACCGC CGTTTCTGCA

401 CAGTTTAG

This corresponds to amino acid sequence [<SEQ ID 20; ORF5>] (SEQ ID NO: 20; ORF5):



1 ..NHMAIVIDEY GGTSGLVTFE DIIEQIVGEI EDEFDEDDSA DNIHAVSSDT
51 WRIHAATEIE DINTFFGTEY SIEEADTIXR PGHSRVGTSA RARRKSPYRR
101 FAVHRRTRRQ PPPAYADGDP REVS....XR RFCTV*



Further sequence analysis revealed the complete DNA sequence to be [<SEQ ID 21>] (SEQ ID
NO: 21):

1 ATGGACGGCG CACAACCGAA AACGAATTTT TTTGAACGCC TGATTGCCCG

51 ACTCGCCCGC GAACCCGATT CCGCCGAAGA CGTATTAAAC CTGCTTCGGC 

101 AGGCGCACGA GCAGGAAGTT TTTGATGCGG ATACGCTTTT AAGATTGGAA 

151 AAAGTCCTCG ATTTTTCCGA TTTGGAAGTG CGCGACGCGA TGATTACGCG 

201 CAGCCGTATG AACGTTTTAA AAGAAAACGA CAGCATCGAG CGCATCACCG 

251 CCTACGTTAT CGATACCGCC CATTCGCGCT TCCCCGTCAT CGGCGAAGAC 

301 AAAGACGAAG TTTTGGGCAT TTTGCACGCC AAAGACCTGC TCAAATATAT 

351 GTTTAACCCC GAGCAGTTCC ACCTCAAATC CATTCTCCGC CCCGCCGTCT 

401 TCGTCCCCGA AGGCAAATCG CTGACCGCCC TTTTAAAAGA GTTCCGCGAA

451 CAGCGCAACC ATATGGCGAT TGTCATCGAC GAATACGGCG GCACATCCGG

501 CTTGGTCACC TTTGAAGACA TCATCGAGCA AATCGTCGGC GAAATCGAAG 

551 ACGAGTTTGA CGAAGACGAT AGCGCCGACA ATATCCATGC CGTTTCTTCC 

601 GAACGCTGGC GCATCCATGC AGCTACCGAA ATCGAAGACA TCAACACCTT 

651 CTTCGGCACG GAATACAGCA GCGAAGAAGC CGACACCATT CGGCCTGGTC 

701 ATTCAAGAGT TGGGACATCT GCCCGTGCGC GGCGAAAAAG TCCTTATCGG 

751 CGGTTTGCAG TTCACCGTCG CACGCGCCGA CAACCGCCGC CTGCATACGC 

801 TGATGGCGAC CCGCGTGAAG TAAGCACCGC CGTTTCTGCA CAGTTTAGGA 

851 TGACGGTACG GGCGTTTTCT GTTTCAATCC GCCCCATCCG CCAAACATAA 

This corresponds to amino acid sequence [<SEQ ID 22; ORF5-1>] (SEQ ID NO: 22; ORF5-1):



1 MDGAQPKTNF FERLIARLAR EPDSAEDVLN LLRQAHEQEV FDADTLLRLE 

51 KVLDFSDLEV RDAMITRSRM NVLKENDSIE RITAYVIDTA HSRFPVIGED 

101 KDEVLGILHA KDLLKYMFNP EQFHLKSILR PAVFVPEGKS LTALLKEFRE 

151 QRNHMAIVID EYGGTSGLVT FEDIIEQIVG EIEDEFDEDD SADNIHAVSS 

201 ERWRIHAATE IEDINTFFGT EYSSEEADTI RPGHSRVGTS ARARRKSPYR 

251 RFAVHRRTRR QPPPAYADGD PREVSTAVSA QFRMTVRAFS VSIRPIRQT* 

Further work identified the corresponding gene in strain A of N. meningitidis [<SEQ ID 23>] (SEQ 
ID NO: 23) : 



1 ATGGACGGCG CACAACCGAA AACAAATTTT TTNNAACGCC TGATTGCCCG 

51 ACTCGCCCGC GAACCCGATT CCGCCGAAGA CGTATTGACC CTGTTGCGCC 

101 AAGCGCACGA ACAGGAAGTA TTTGATGCGG ATACGCTTTT AAGATTGGAA 

151 AAAGTCCTCG ATTTTTCTGA TTTGGAAGTG CGCGACGCGA TGATTACGCG 

201 CAGCCGTATG AACGTTTTAA AAGAAAACGA CAGCATCGAA CGCATCACCG
251 CCTACGTTAT CGATACCGCC CATTCGCGCT TCCCCGTCAT CGGTGAAGAC

301 AAAGACGAAG TTTTGGGTAT TTTGCACGCC AAAGACCTGC TCAAATATAT

351 GTTCAACCCC GAGCAGTTCC ACCTCAAATC GATATTGCGC CCTGCCGTCT

401 TCGTCCCCGA AGGCAAATCG CTGACCGCCC TTTTAAAAGA GTTCCGCGAA
451 CAGCGCAACC ATATGGCAAT CGTCATCGAC GAATACGGCG GCACGTCGGG
501 TTTGGTAACT TTTGAAGACA TCATCGAGCA AATCGTCGGC GACATCGAAG 
551 ATGAGTTTGA CGAAGACGAA AGCGCGGACA ACATCCACGC CGTTTCCGCC 
601 GAACGCTGGC GCATCCACGC GGCTACCGAA ATCGAAGACA TCAACGCCTT 
651 TTTCGGCACG GAATACAGCA GCGAAGAAGC CGACACCATC GGCGGCCNTG 
701 GTCATTCAGG AATTGGNACA CCTGCCCGTG CGCGGCGAAA AAGTCNTTAT 
751 CGGCGNNTTG CANTTCACNG TCGCCNGCGC NGACAACCGC CGCCTGCATA 
801 CGCTGATGGC GACCCGCGTG AAGTAAGCTC CGCCGTTTCT GTACAGTTTA 
851 GGATGACGGT ACGGGCGTTT TCTGTTTCAA TCCGCCCCAT CCGCCANACA 
901 TAA 

This encodes a protein having amino acid sequence [<SEQ ID 24; ORF5a>] (SEQ ID NO: 24;
ORF5a):



1 MDGAQPKTNF XXRLIARLAR EPDSAEDVLT LLRQAHEQEV FDADTLLRLE
51 KVLDFSDLEV RDAMITRSRM NVLKENDSIE RITAYVIDTA HSRFPVIGED
101 KDEVLGILHA KDLLKYMFNP EQFHLKSILR PAVFVPEGKS LTALLKEFRE
151 QRNHMAIVID EYGGTSGLVT FEDIIEQIVG DIEDEFDEDE SADNIHAVSA
201 ERWRIHAATE IEDINAFFGT EYSSEEADTI GGXGHSGIGT PARARRKSXY
251 RRXAXHXRXR XQPPPAYADG DPREVSSAVS VQFRMTVRAF SVSIRPIRXT
301 *



The originally-identified partial strain B sequence (ORF5) (SEQ ID NO: 20) shows 54.7% identity
over a 124aa overlap with ORF5a (SEQ ID NO: 24):



10 20 30 

orf 5 .pep NHMAIVIDEYGGTSGLVTFEDI IEQIVGEI 

I I I I M I I I I I I I I I I II I I I I I I I I h I 
orf 5a FHLKSILRPAVFVPEGKSLTALLKEFREQRNHMAIVIDEYGGTSGLVTFEDIIEQIVGDI 
130 140 150 160 170 180 

40 50 60 70 80 90 

orf 5 . pep EDEFDEDDSADNIHAVSSDTWRIHAATEIEDINTFFGTEYSIEEADTIXRPGHSRVGTSA 

Mill IM IIMIII I- I II Mill II III :|IMIM MINI Ml Ml I 

orf 5a EDEFDEDESADNIHAVSAERWRIHAATEIEDINAFFGTEYSSEEADTIGGXGHSGIGTPA 
190 200 210 220 230 240 

100 110 120 130 

orf 5 . pep RARRKS PYRRFAVHRRTRRQPPPAYADGDPREVSXXXXXRRFCTV 

MINI III I I hi IIIIIIIMIIIIII 

orf 5a RARRKSXYRRXAXHXRXRXQPPPAYADGDPREVSSAVSVQFRMTVRAFSVSIRPIRXTX 
250 260 270 280 290 300 

The complete strain B sequence (ORF5-1) (SEQ ID NO: 22) and ORF5a (SEQ ID NO: 24) show
92.7% identity in 300 aa overlap:



10 20 30 40 50 60 

or f 5a . pep MDGAQPKTNFXXRLIARLAREPDSAEDVLTLLRQAHEQEVFDADTLLRLE KVLDFSDLEV 

I II I | | I I I I I I I I' I I I I h i M I I M I I I I I I ■ I I I I I I i I I I I M M 

or f 5 - 1 MDGAQPKTNFFERLIARLAREPDSAEDVLNLLRQAHEQEVFDADTLLRLE KVLDFSDLEV 

10 20 30 40 50 60 

70 80 90 100 110 120

orf 5a . pep RDAMITRSRMNVLKENDSIERITAYVIDTAHSRFPVIGEDKDEVLGILHAKDLLKYMFNP 

IIIIIIIMIIIIMilllllhllllM llllll IMIM MMII IIIIIIM 

orf 5-1 RDAMITRSRMNVLKENDSIERITAYVIDTAHSRFPVIGEDKDEVLGILHAKDLLKYMFNP 

70 80 90 100 110 120 






130 140 150 160 170 180 

orf 5a . pep EQFHLKSILRPAVFVPEGKSLTALLKEFREQRNHMAIVIDEYGGTSGLVTFEDI IEQIVG 

1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 M 1 1 1 1 H 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 i 1 1 1 1 M II 1 1 

orf 5-1 EQFHLKS ILRPAVFVPEGKSLTALLKEFREQRNHMAIVIDEYGGTSGLVTFEDI IEQIVG 

130 140 150 160 170 180 






190 200 210 220 230 240 

orf 5a. pep DIEDEFDEDESADNIHAVSAERWRIHAATEIEDINAFFGTEYSSEEADTIGGXGHSGIGT 

:| I I I I I M : I I I I II 1 I I :| I I I I I I I I I I I II hi I I I i M I I I I M I Ml : I I 
orf 5-1 EIEDEFDEDDSADNIHAVSSERWRIHAATEIEDINTFFGTEYSSEEADTIRP-GHSRVGT 

190 200 210 220 230 



250 



260 



270 



280 



290 



300 
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orf 5a . pep PARARRKSXYRRXAXHXRXRXQPPPAYADGDPREVSSAVSVQFRMTVRAFSVSIRPIRXT 

Illllll III I I hi II IIIIIIIIIMhIIMIIIIIII MIIIIM I 

orf 5-1 SARARRKSPYRRFAVHRRTRRQPPPAYADGDPREVSTAVSAQFRMTVRAFSVSIRPIRQT 
240 250 260 270 280 290 

Further work identified a partial DNA sequence in N. gonorrhoeae [<SEQ ID 25>] (SEQ ID
NO: 25) which encodes a protein having amino acid sequence [<SEQ ID 26; ORF5ng>] (SEQ ID
NO: 26; ORF5ng):



1 MDGAQPKTNF FERLIARLAR EPDSAEDVLN LLRQAHEQEV FDADTLTRLE 

51 KVLDFAELEV RDAMITRSRM NVLKENDSIE RITAYVIDTA HSRFPVIGED 

101 KDEVLGILHA KDLLKYMFNP EQFHLKSVLR PAVFVPEGKS LTALLKEFRE 

151 QRNHMAIVID EYGGTSGLVT FEDIIEQIVG DIEDEFDEDE SADDIHSVSA 

201 ERWRIHAATE IEDINAFFGT EYGSEEADTI RRLGHSGIGT PARARRKSPY

251 RRFAVHRRPR RQPPPAHADG DPREVSRACP HRRFCTV* 

Further analysis revealed the complete gonococcal nucleotide sequence [<SEQ ID 27>] (SEQ ID
NO: 27) to be:



1 ATGGACGGCG CACAACCGAA AACAAATTTT TTTGAACGCC TGATTGCCCG 

51 ACTCGCCCGC GAACCCGATT CCGCCGAAGA CGTATTAAAC CTGCTTCGGC 

101 AGGCGCACGA ACAGGAAGTT TTTGATGCCG ACACACTGAC CCGGCTGGAA 

151 AAAGTATTGG ACTTTGCCGA GCTGGAAGTG CGCGATGCGA TGATTACGCG 

201 CAGCCGCATG AACGTATTGA AAGAAAACGA CAGCATCGAA CGCATCACCG 

251 CCTACGTCAT CGATACCGCC CATTCGCGCT TCCCCGTCAT CGGCGAAGAC 

301 AAAGACGAAG TTTTGGGCAT TTTGCACGCC AAAGACCTGC TCAAATATAT 

351 GTTCAACCCC GAGCAGTTCC ACCTGAAATC CGTCTTGCGC CCTGCCGTTT

401 TCGTGCCCGA AGGCAAATCT TTGACCGCCC TTTTAAAAGA GTTCCGCGAA
451 CAGCGCAACC ATATGGCAAT CGTCATCGAC GAATACGGCG GCACGTCGGG
501 TTTGGTCACC TTTGAAGACA TCATCGAGCA AATCGTCGGT GACATCGAAG
551 ACGAGTTTGA CGAAGACGAA AGCGccgacg acatCCACTC cgTTTccgCC
601 GAACGCTGGC GCATCCacgc ggctaCCGAA ATCGAAGaca TCAACGCCTT
651 TTTCGGTACG GAatacggca gcgaagaagc cgacaccatc cggcggctTG
701 GTCATTCAGG AATTGGGACA CCTGCCCGTG CGCGGCGAAA AAGTCCTTAt
751 cggcgGTTTG Cagttcaccg tCGCCCGCGC CGACAACCGC CGCCTGCACA 
801 CGCTGATGGC GACCCGCGTG AAGTAAGCAG AGCCTGCCcg AccgccgttT 
851 CTGCacAGTT TAGGatgACG gtaCGGTCGT TTTCTGTTTC AATCCGCCCC 
901 ATCCGCCAAA CATAA 
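
The numbered, ten-residue-per-block layout used for the listings above can be collapsed into one continuous string before any further analysis. The sketch below is a minimal illustration only; the helper name `join_listing` is hypothetical, and it assumes that each row begins with its starting position followed by whitespace-separated blocks. Upper-casing discards the distinction the listing makes between upper- and lowercase bases.

```python
import re

def join_listing(rows):
    """Collapse numbered sequence rows into one uppercase string.

    Each row is expected to look like '901 ATCCGCCAAA CATAA': a start
    position followed by whitespace-separated blocks of residues.
    """
    residues = []
    for row in rows:
        parts = row.split()
        if not parts:
            continue  # skip blank rows between listing lines
        if parts[0].isdigit():
            parts = parts[1:]  # drop the leading position number
        residues.append("".join(parts))
    sequence = "".join(residues).upper()
    # Keep letters (bases, residues, ambiguity codes) and '*' for stop.
    return re.sub(r"[^A-Z*]", "", sequence)

# Last two rows of the gonococcal listing (SEQ ID NO: 27), as printed above.
rows = [
    "851 CTGCacAGTT TAGGatgACG gtaCGGTCGT TTTCTGTTTC AATCCGCCCC",
    "901 ATCCGCCAAA CATAA",
]
tail = join_listing(rows)
print(len(tail), tail[-5:])  # 65 CATAA
```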

This encodes a protein having amino acid sequence [<SEQ ID 28; ORF5ng-1>] (SEQ ID NO: 28;
ORF5ng-1):



1 MDGAQPKTNF FERLIARLAR EPDSAEDVLN LLRQAHEQEV FDADTLTRLE 

51 KVLDFAELEV RDAMITRSRM NVLKENDSIE RITAYVIDTA HSRFPVIGED 

101 KDEVLGILHA KDLLKYMFNP EQFHLKSVLR PAVFVPEGKS LTALLKEFRE 

151 QRNHMAIVID EYGGTSGLVT FEDIIEQIVG DIEDEFDEDE SADDIHSVSA 

201 ERWRIHAATE IEDINAFFGT EYGSEEADTI RRLGHSGIGT PARARRKSPY 

251 RRFAVHRRPR RQPPPAHADG DPREVSRACP TAVSAQFRMT VRSFSVSIRP 

301 IRQT*



The originally-identified partial strain B sequence (ORF5) (SEQ ID NO: 20) shows 83.1% identity
over a 135aa overlap with the partial gonococcal sequence (ORF5ng) (SEQ ID NO: 26):
orf5                                  NHMAIVIDEYGGTSGLVTFEDIIEQIVGEI
                                      ||||||||||||||||||||||||||||:|
orf5ng  FHLKSVLRPAVFVPEGKSLTALLKEFREQRNHMAIVIDEYGGTSGLVTFEDIIEQIVGDI

orf5    EDEFDEDDSADNIHAVSSDTWRIHAATEIEDINTFFGTEYSIEEADTIXRPGHSRVGTSA
        |||||||:|||:||:||:: |||||||||||||:||||||: |||||| | ||| :|| |
orf5ng  EDEFDEDESADDIHSVSAERWRIHAATEIEDINAFFGTEYGSEEADTIRRLGHSGIGTPA

orf5    RARRKSPYRRFAVHRRTRRQPPPAYADGDPREVSX----RRFCTV 131
        |||||||||||||||| |||||||:|||||||||     ||||||
orf5ng  RARRKSPYRRFAVHRRPRRQPPPAHADGDPREVSRACPHRRFCTV 287

The complete strain B and gonococcal sequences (ORF5-1 & ORF5ng-1) (SEQ ID NO: 22 & SEQ
ID NO: 28) show 92.4% identity in 304 aa overlap:



10 20 30 40 50 60
orf5ng-1.pep MDGAQPKTNFFERLIARLAREPDSAEDVLNLLRQAHEQEVFDADTLTRLEKVLDFAELEV
             |||||||||||||||||||||||||||||||||||||||||||||| ||||||||::|||
orf5-1       MDGAQPKTNFFERLIARLAREPDSAEDVLNLLRQAHEQEVFDADTLLRLEKVLDFSDLEV
10 20 30 40 50 60

70 80 90 100 110 120
orf5ng-1.pep RDAMITRSRMNVLKENDSIERITAYVIDTAHSRFPVIGEDKDEVLGILHAKDLLKYMFNP
             ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
orf5-1       RDAMITRSRMNVLKENDSIERITAYVIDTAHSRFPVIGEDKDEVLGILHAKDLLKYMFNP
70 80 90 100 110 120

130 140 150 160 170 180
orf5ng-1.pep EQFHLKSVLRPAVFVPEGKSLTALLKEFREQRNHMAIVIDEYGGTSGLVTFEDIIEQIVG
             |||||||:||||||||||||||||||||||||||||||||||||||||||||||||||||
orf5-1       EQFHLKSILRPAVFVPEGKSLTALLKEFREQRNHMAIVIDEYGGTSGLVTFEDIIEQIVG
130 140 150 160 170 180

190 200 210 220 230 240
orf5ng-1.pep DIEDEFDEDESADDIHSVSAERWRIHAATEIEDINAFFGTEYGSEEADTIRRLGHSGIGT
             :||||||||:|||:||:||:|||||||||||||||:||||||:||||||||  ||| :||
orf5-1       EIEDEFDEDDSADNIHAVSSERWRIHAATEIEDINTFFGTEYSSEEADTIRP-GHSRVGT
190 200 210 220 230

250 260 270 280 290 300
orf5ng-1.pep PARARRKSPYRRFAVHRRPRRQPPPAHADGDPREVSRACPTAVSAQFRMTVRSFSVSIRP
              ||||||||||||||||| |||||||:|||||||||    ||||||||||||:|||||||
orf5-1       SARARRKSPYRRFAVHRRTRRQPPPAYADGDPREVS----TAVSAQFRMTVRAFSVSIRP
240 250 260 270 280 290

orf5ng-1.pep IRQTX
             |||||
orf5-1       IRQTX
300

Computer analysis of these amino acid sequences indicates a putative leader sequence, and
identified the following homologies:

Homology with hemolysin homolog TlyC (accession U32716) (SEQ ID NO: 1111) of H. influenzae

ORF5 (SEQ ID NO: 20) and TlyC proteins (SEQ ID NO: 1111) show 58% aa identity in 77 aa
overlap (BLASTp).

ORF5   2 HMAIVIDEYGGTSGLVTFEDIIEQIVGEIEDEFDEDDSADNIHAVSSDTWRIHAATEIED 61
         HMAIV+DE+G  SGLVT EDI+EQIVG+IEDEFDE++ AD I  +S  T+ + A T+I+D
TlyC 166 HMAIVVDEFGAVSGLVTIEDILEQIVGDIEDEFDEEEIAD-IRQLSRHTYAVRALTDIDD 224

ORF5  62 INTFFGTEYSIEEADTI 78
          N  F T++  EE DTI
TlyC 225 FNAQFNTDFDDEEVDTI 241
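
The identity figures quoted for alignments such as the one above are simply the number of columns in which both rows carry the same residue, divided by the number of aligned columns. The following is a minimal sketch of that arithmetic, applied to the second (17-column) block of the ORF5 / TlyC alignment only; the function name `percent_identity` is an assumption made for illustration. The 58% figure reported by BLASTp refers to the full 77-column overlap, so this single block is not expected to reproduce it.

```python
def percent_identity(seq_a: str, seq_b: str, gap: str = "-"):
    """Return (identical, aligned_columns, percent) for two aligned rows."""
    if len(seq_a) != len(seq_b):
        raise ValueError("aligned rows must have equal length")
    # A column counts as identical only if both rows hold the same residue.
    identical = sum(
        1 for a, b in zip(seq_a, seq_b) if a == b and a != gap
    )
    columns = len(seq_a)
    return identical, columns, 100.0 * identical / columns

# Second block of the ORF5 / TlyC alignment printed above.
orf5_block = "INTFFGTEYSIEEADTI"
tlyc_block = "FNAQFNTDFDDEEVDTI"
identical, columns, pct = percent_identity(orf5_block, tlyc_block)
print(identical, columns, round(pct, 1))  # 8 17 47.1
```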

ORF5ng-1 (SEQ ID NO: 28) also shows significant homology with TlyC (SEQ ID NO: 1111):

SCORES Initl: 301 Initn: 419 Opt: 668 

Smith-Waterman score: 668; 45.9% identity in 242 aa overlap 

10 20 30 40 50 

orf5ng-l .pep MDGAQPKTNFFERLIARLAR-EPDSAEDVLNLLRQAHEQEVFDADTLTRLEK 
15 I ||: |::|: : | : | ■=:: = :: | :::::::: | :| :| 

tlyc_haein MNDEQQNSNQSENTKKPFFQSLFGRFFQGELKNREELVEVIRDSEQNDLIDQNTREMIEG 

10 20 30 40 50 60 

60 70 80 90 100 109 

orf 5ng-l .pep VLDFAELEVRDAMITRSRMNVLKENDSIERITAYVIDTAHSRFPVIGE- -DKDEVLGILH 

20 |:::|||:||l II 11 = = =:: = :::: = I = = I I I I I I I I = = l = l = = = llll 

tlyc_haein VMEIAELRVRDIMIPRSQIIFIEDQQDLNTCLNTIIESAHSRFPVIADADDRDNIVGILH 

70 80 90 100 110 120 

110 120 130 140 150 160 

orf 5ng-l .pep AKDLLKYMF-NPEQFHLKSVLRPAVFVPEGKSLTALLKEFREQRNHMAIVIDEYGGTSGL 

25 111111 = = = I I 1 = 1 = 111 = 1 = 111 = 1 = =11 = 11 =1 I II I 1 = 11 = l = = l I I 

tlyc_haein AKDLLKFLREDAEVFDLSSLLRPWIVPESKRVDRMLKDFRSERFHMAIWDEFGAVSGL 

130 140 150 160 170 180 

170 180 190 200 210 220 

orf 5ng-l .pep VTFEDIIEQIVGDIEDEFDEDESADDIHSVSAERWRIHAATEIEDINAFFGTEYGSEEAD 

30 11 = 111 = 111 II lllllllhl II h = s| = = = = l 1 = 1 = 1 = 11 l = l = = =1 hi 

tlyc_haein VTIEDILEQIVGDIEDEFDEEEIAD- IRQLSRHTYAVRALTDIDDFNAQFNTDFDDEEVD 

190 200 210 220 230 

230 240 250 260 270 280 

orf 5ng- 1 . pep TIRRLGHSGIG-TPARARRKSPYRRFAVHRRPRRQPPPAHADGDPREVSRACPTAVSAQF 

35 II I = =1 I 1= 

tlyc_haein TIGGLIMQTFGYLPKRGEEIILKNLQFKVTSADSRRLIQLRVTVPDEHLAEMNNVDEKSE 

240 250 260 270 280 290 
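
The Smith-Waterman score quoted above is the score of the best local alignment between the two sequences. A minimal sketch of the recurrence follows, assuming simple match/mismatch scores and a linear gap penalty chosen purely for illustration. The program that produced the figures in this document would have used an amino acid substitution matrix and its own gap penalties, so this sketch is not expected to reproduce the score of 668.

```python
def smith_waterman_score(a: str, b: str, match: int = 2,
                         mismatch: int = -1, gap: int = -2) -> int:
    """Best local alignment score with a linear gap penalty."""
    previous = [0] * (len(b) + 1)
    best = 0
    for i in range(1, len(a) + 1):
        current = [0] * (len(b) + 1)
        for j in range(1, len(b) + 1):
            step = match if a[i - 1] == b[j - 1] else mismatch
            current[j] = max(
                0,                      # start a new local alignment
                previous[j - 1] + step, # extend along the diagonal
                previous[j] + gap,      # gap in b
                current[j - 1] + gap,   # gap in a
            )
            best = max(best, current[j])
        previous = current
    return best

# The shared core 'HMAIV' scores 5 matches x 2; the flanks do not extend it.
print(smith_waterman_score("XXHMAIVXX", "YYHMAIVYY"))  # 10
```

Only two rows of the scoring matrix are kept at any time, which is enough when just the score, and not the alignment itself, is needed.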



Homology with a hypothetical secreted protein from E. coli:

ORF5a (SEQ ID NO: 24) shows homology to a hypothetical secreted protein (SEQ ID NO: 1112)
from E. coli:




sp|P77392|YBEX_ECOLI HYPOTHETICAL 33.3 KD PROTEIN IN CUTE-ASNB INTERGENIC REGION
>gi|1778577 (U82598) similar to H. influenzae [Escherichia coli] >gi|1786879
(AE000170) f292; This 292 aa ORF is 23% identical (9 gaps) to 272 residues of an
approx. 440 aa protein YTFL_HAEIN SW: P44717 [Escherichia coli] Length = 292

Score = 212 bits (533), Expect = 3e-54

Identities = 112/230 (48%), Positives = 149/230 (64%), Gaps = 3/230 (1%)

Query:   2 DGAQPKTNFXXRLIARLAR-EPDSAEDVLTLLRQAHEQEVFDADTLLRLEKVLDFSDLEV 60
           D    K  F   L+++L   EP + +++L L+R + + ++ D DT   LE V+D +D  V
Sbjct:  10 DTISNKKGFFSLLLSQLFHGEPKNRDELLALIRDSGQNDLIDEDTRDMLEGVMDIADQRV 69

Query:  61 RDAMITRSRMNVLKENDSIERITAYVIDTAHSRFPVIGEDKDEVLGILHAKDLLKYM-FN 119
           RD MI RS+M  LK N +++     +I++AHSRFPVI EDKD + GIL AKDLL +M  +
Sbjct:  70 RDIMIPRSQMITLKRNQTLDECLDVIIESAHSRFPVISEDKDHIEGILMAKDLLPFMRSD 129

Query: 120 PEQFHLKSILRPAVFVPEGKSLTALLKEFREQRNHMAIVIDEYGGTSGLVTFEDIIEQIV 179
            E F +  +LR AV VPE K +  +LKEFR QR HMAIVIDE+GG SGLVT EDI+E IV
Sbjct: 130 AEAFSMDKVLRQAVVVPESKRVDRMLKEFRSQRYHMAIVIDEFGGVSGLVTIEDILELIV 189

Query: 180 GDIEDEFDEDESADNIHAVSAERWRIHAATEIEDINAFFGTEYSSEEADT 229
           G+IEDE+DE++  D    +S   W + A   IED N  FGT +S EE DT
Sbjct: 190 GEIEDEYDEEDDID-FRQLSRHTWTVRALASIEDFNEAFGTHFSDEEVDT 238
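
The percentages in a BLAST summary line such as the one above follow directly from the printed counts. The sketch below parses such a line and recomputes each figure; the helper name `parse_blast_counts` is hypothetical. The percentages appear to be truncated rather than rounded (112/230 is 48.7%, reported as 48%), which is an inference from the figures in this document rather than a statement about the program that produced them.

```python
import re

def parse_blast_counts(summary: str) -> dict:
    """Extract 'Name = n/m (p%)' fields from a BLAST summary line."""
    pattern = r"(\w+) = (\d+)/(\d+) \((\d+)%\)"
    fields = {}
    for name, num, den, pct in re.findall(pattern, summary):
        fields[name] = (int(num), int(den), int(pct))
    return fields

summary = ("Identities = 112/230 (48%), Positives = 149/230 (64%), "
           "Gaps = 3/230 (1%)")
for name, (num, den, reported) in parse_blast_counts(summary).items():
    computed = (100 * num) // den  # integer division truncates
    print(name, computed, reported)
```

For the line shown, each recomputed value matches the reported one (48, 64 and 1).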

Based on this analysis, including the amino acid homology to the TlyC hemolysin-homologue from
H. influenzae (hemolysins are secreted proteins), it was predicted that the proteins from
N. meningitidis and N. gonorrhoeae are secreted and could thus be useful antigens for vaccines or
diagnostics.

ORF5-1 (SEQ ID NO: 22) (30.7kDa) was cloned in the pGex vector and expressed in E. coli, as
described above. The products of protein expression and purification were analyzed by SDS-PAGE.
Figure 2A shows the results of affinity purification of the GST-fusion protein. Purified
GST-fusion protein was used to immunise mice, whose sera were used for Western blot analysis
(Figure 1B). These experiments confirm that ORF5-1 (SEQ ID NO: 22) is a surface-exposed
protein, and that it is a useful immunogen.

Example 5

The following partial DNA sequence was identified in N. meningitidis [<SEQ ID 29>] (SEQ ID
NO: 29):

1 ATGCGCGGCG GCAGGCCGGA TTCCGTTACC GTGCAGATTA TCGAAGGTTC 

51 GCGTTTTTCG CATATGAGGA AAGTCATCGA CGCAACGCCC GACATCGGAC 

101 ACGACACCAA AGGCTGGAGC AATGAAAAAC TGATGGCGGA AGTTGCGCCC

151 GATGCCTTCA GCGGCAATCC TGAAgGGCAG TTTTTCCCCG ACAGCTACGA 

201 AATCGATGCG GGCGGCAGTG ATTTGCAGAT TTACCAAACC GCCTACAAgG 
251 GCGATGCAAC GCCGCCTGAA TGAgGGCATG GGAAAGCAGG CAGGACGGGC

301 TGCCTTATAA AAACCCTTAT GAAATGCTGA TTATGGCGAr CCTGGTCGAA 

351 AAGGAAACAG GGCATGAAGC CGAsCsCGAC CATGTcGCTT CCGTCTTCGT 

401 CAACCGCCTG AAAATCGGTA TGCGCCTGCA AACCgAssCG TCCGTGATTT

451 ACGGCATGGG TGCGGCATAC AAGGGCAAAA TCCGTAAAGC CGACCTGCGC 

501 CGCGACACGC CGTACAACAC CTACACGCGC GGCGGTCTGC. CGCCAACCCC 

551 GATTGCGCTG CCC. 

This corresponds to the amino acid sequence [<SEQ ID 30; ORF7>] (SEQ ID NO: 30; ORF7):



1 MRGGRPDSVT VQIIEGSRFS HMRKVIDATP DIGHDTKGWS NEKLMAEVAP 

51 DAFSGNPEGQ FFPDSYEIDA GGSDLQIYQT AYKAMQRRLN EAWESRQDGL 

101 PYKNPYEMLI MAXLVEKETG HEAXXDHVAS VFVNRLKIGM RLQTXXSVIY 

151 GMGAAYKGKI RKADLRRDTP YNTYTRGGLP PTPIALP.. 

Further sequence analysis revealed the complete DNA sequence [<SEQ ID 31>] (SEQ ID NO: 31):



1 ATGTTGAGAA AATTGTTGAA ATGGTCTGCC GTTTTTTTGA CCGTGTCGGC 

51 AGCCGTTTTC GCCGCGCTGC TTTTTGTTCC TAAGGATAAC GGCAGGGCAT 

101 ACCGAATCAA AATTGCCAAA AACCAGGGTA TTTCGTCGGT CGGCAGGAAA 

151 CTTGCCGAAG ACCGCATCGT GTTCAGCAGG CATGTTTTGA CGGCGGCGGC 

201 CTACGTTTTG GGTGTGCACA ACAGGCTGCA TACGGGGACG TACAGATTGC 

251 CTTCGGAAGT GTCTGCTTGG GATATCTTGC AGAAAATGCG CGGCGGCAGG 

301 CCGGATTCCG TTACCGTGCA GATTATCGAA GGTTCGCGTT TTTCGCATAT 

351 GAGGAAAGTC ATCGACGCAA CGCCCGACAT CGGACACGAC ACCAAAGGCT 

401 GGAGCAATGA AAAACTGATG GCGGAAGTTG CGCCCGATGC CTTCAGCGGC

451 AATCCTGAAG GGCAGTTTTT CCCCGACAGC TACGAAATCG ATGCGGGCGG 

501 CAGTGATTTG CAGATTTACC AAACCGCCTA CAAGGCGATG CAACGCCGCC 

551 TGAATGAGGC ATGGGAAAGC AGGCAGGACG GGCTGCCTTA TAAAAACCCT 

601 TATGAAATGC TGATTATGGC GAGCCTGGTC GAAAAGGAAA CAGGGCATGA 

651 AGCCGACCGC GACCATGTCG CTTCCGTCTT CGTCAACCGC CTGAAAATCG 

701 GTATGCGCCT GCAAACCGAC CCGTCCGTGA TTTACGGCAT GGGTGCGGCA 

751 TACAAGGGCA AAATCCGTAA AGCCGACCTG CGCCGCGACA CGCCGTACAA 

801 CACCTACACG CGCGGCGGTC TGCCGCCAAC CCCGATTGCG CTGCCCGGCA 

851 AGGCGGCACT CGATGCCGCC GCCCATCCGT CCGGCGAAAA ATACCTGTAT 

901 TTCGTGTCCA AAATGGACGG CACGGGCTTG AGCCAGTTCA GCCATGATTT 

951 GACCGAACAC AATGCCGCCG TCCGCAAATA TATTTTGAAA AAATAA 

This corresponds to the amino acid sequence [<SEQ ID 32; ORF7-1>] (SEQ ID NO: 32; ORF7-1):



1 MLRKLLKWSA VFLTVSAAVF AALLFVPKDN GRAYRIKIAK NQGISSVGRK

51 LAEDRIVFSR HVLTAAAYVL GVHNRLHTGT YRLPSEVSAW DILQKMRGGR 

101 PDSVTVQIIE GSRFSHMRKV IDATPDIGHD TKGWSNEKLM AEVAPDAFSG 

151 NPEGQFFPDS YEIDAGGSDL QIYQTAYKAM QRRLNEAWES RQDGLPYKNP 

201 YEMLIMASLV EKETGHEADR DHVASVFVNR LKIGMRLQTD PSVIYGMGAA 

251 YKGKIRKADL RRDTPYNTYT RGGLPPTPIA LPGKAALDAA AHPSGEKYLY 

301 FVSKMDGTGL SQFSHDLTEH NAAVRKYILK K*

Computer analysis of this amino acid sequence gave the following results: 



Homology with hypothetical protein encoded by yceg gene (accession P44270) (SEQ ID NO:
1113) of H. influenzae

ORF7 (SEQ ID NO: 30) and yceg proteins (SEQ ID NO: 1113) show 44% aa identity in 192 aa
overlap:

ORF7 1 MRGGRPDSVTVQIIEGSRFSHMRKVIDATPDIGHDTKGWSNEKLMA EVAPDAFSG 55 

+ G+ V+ I EG F RK ++ P + K SNE++ A ++ + 

yceg 102 LNSGKEVQFNVKWIEGKTFKDWRKDLENAPHLVQTLKDKSNEEIFALLDLPDIGQNLELK 161

ORF7 56 NPEGQFFPDSYEIDAGGSDLQIYQTAYKAMQRRLNEAWESRQDGLPYKNPYEMLIMAXLV 115 

N EG +PD+Y +DL++ + + + M++ LN+AW R + LP NPYEMLI+A +V 

yceg 162 NVEGWLYPDTYNYTPKSTDLELLKRSAERMKKALNKAWNERDEDLPLANPYEMLILASIV 221 

ORF7 116 EKETGHEAXXDHVASVFVNRLKIGMRLQTXXSVIYGMGAAYKGKIRKADLRRDTPYNTYT 175 
EKETG VASVF+NRLK M+LQT +VIYGMG Y G IRK DL TPYNTY

yceg 222 EKETGIANERAKVASVFINRLKAKMKLQTDPTVIYGMGENYNGNIRKKDLETKTPYNTYV 281 

ORF7 176 RGGLPPTPIALP 187 

GLPPTPIA+P 
yceg 282 IDGLPPTPIAMP 293 



The complete length YCEG protein (SEQ ID NO: 1113) has sequence:



1 MKKFLIAILL LILILAGVAS FSYYKMTEFV KTPVNVQADE LLTIERGTTS

51 SKLATLFEQE KLIADGKLLP YLLKLKPELN KIKAGTYSLE NVKTVQDLLD 

101 LLNSGKEVQF NVKWIEGKTF KDWRKDLENA PHLVQTLKDK SNEEIFALLD 

151 LPDIGQNLEL KNVEGWLYPD TYNYTPKSTD LELLKRSAER MKKALNKAWN

201 ERDEDLPLAN PYEMLILASI VEKETGIANE RAKVASVFIN RLKAKMKLQT 

251 DPTVIYGMGE NYNGNIRKKD LETKTPYNTY VIDGLPPTPI AMPSESSLQA 

301 VANPEKTDFY YFVADGSGGH KFTRNLNEHN KAVQEYLRWY RSQKNAK 

Homology with a predicted ORF from N. meningitidis (strain A)

ORF7 (SEQ ID NO: 30) shows 95.2% identity over a 187aa overlap with an ORF (ORF7a) (SEQ
ID NO: 34) from strain A of N. meningitidis:

10 20 30 

orf7 pep MRGGRPDSVTVQI IEGSRFSHMRKVIDATP 

IMIIIIIIM IMMIII llllllll' 

30 orf 7a AAYVLGVHNRLHTGTYRLPSEVSAWDILQKMRGGRPDSVTVQI IEGSRFSHMRKVIDATP 

70 80 90 100 110 120 

40 50 60 70 80 90

orf 7 . pep DIGHDTKGWSNEKLMAEVAPDAFSGNPEGQFFPDSYEIDAGGSDLQIYQTAYKAMQRRLN 

II I illlllll MIIIIIMIIIIIIII IIIMIIIIIIMI llllllllll 

35 orf 7a DIEHDTKGWSNEKLMAEVAPDAFSGNPEGQFFPDSYEIDAGGSDLRIYQIAYKAMQRRLN 

130 140 150 160 170 180 

100 110 120 130 140 150 

orf 7 . pep EAWESRQDGLPYKNPYEMLIMAXLVEKETGHEAXXDHVASVFVNRLKIGMRLQTXXSVIY 

M 1 1 1 , M 1 1 1 1 II 1 1 1 M I II hllllllll MIIIIIIIIIIIIIMM MM 

40 orf 7a EAWESRQDGLPYKNPYEMLIMASLIEKETGHEADRDHVASVFVNRLKIGMRLQTDPSVI Y 

190 200 210 220 230 240



160 170 180 
orf 7 . pep GMGAAYKGKIRKADLRRDTPYNTYTRGGLPPTPIALP

I I I I I I I ' I I II I I I I I I I I I I I I II II I M I I , 
orf 7a GMGAAYKGKIRKADLRRDTPYNTYTRGGLPPTPIALPGKAALDAAAHPSGEKYLYFVSKM 

250 260 270 280 290 300 

orf 7a DGTGLSQFSHDLTEHNAAVRKYILKKX 
310 320 330

The complete length ORF7a nucleotide sequence [<SEQ ID 33>] (SEQ ID NO: 33) is:

1 ATGTTGAGAA AATTGTTGAA ATGGTCTGCC GTTTTTTTGA CCGTATCGGC 

51 AGCCGTTTTC GCCGCGCTGC TTTTCGTCCC TAAAGACAAC GGCAGGGCAT 

101 ACAGGATTAA AATTGCCAAA AACCAGGGTA TTTCGTCGGT CGGCAGGAAA 

151 CTTGCCGAAG ACCGCATCGT GTTCAGCAGG CATGTTTTGA CGGCGGCGGC 

201 CTACGTTTTG GGTGTGCACA ACAGGCTGCA TACGGGGACG TACAGACTGC 

251 CTTCGGAAGT GTCTGCTTGG GATATCTTGC AGAAAATGCG CGGCGGCAGG 

301 CCGGATTCCG TTACCGTGCA GATTATCGAA GGTTCGCGTT TTTCGCATAT 

351 GAGGAAAGTC ATCGACGCAA CGCCCGACAT CGAACACGAC ACCAAAGGCT

401 GGAGCAATGA AAAACTGATG GCGGAAGTTG CCCCTGATGC CTTCAGCGGC 

451 AATCCTGAAG GGCAGTTTTT CCCCGACAGC TACGAAATCG ATGCGGGCGG 

501 CAGCGATTTA CGGATTTACC AAATCGCCTA CAAGGCGATG CAACGCCGAC 

551 TGAATGAGGC ATGGGAAAGC AGGCAGGACG GGCTGCCTTA TAAAAACCCT 

601 TATGAAATGC TGATTATGGC GAGCCTGATC GAAAAGGAAA CAGGGCATGA 

651 AGCCGACCGC GACCATGTCG CTTCCGTCTT CGTCAACCGC CTGAAAATCG 

701 GTATGCGCCT GCAAACCGAC CCGTCCGTGA TTTACGGCAT GGGTGCGGCA 

751 TACAAGGGCA AAATCCGTAA AGCCGACCTG CGCCGCGACA CGCCGTACAA 

801 CACCTACACG CGCGGCGGTC TGCCGCCAAC CCCGATCGCG CTGCCCGGCA 

851 AGGCGGCACT CGATGCCGCC GCCCATCCGT CCGGTGAAAA ATACCTGTAT 

901 TTCGTGTCCA AAATGGACGG TACGGGCTTG AGCCAGTTCA GCCATGATTT 

951 GACCGAACAC AACGCCGCCG TTCGCAAATA TATTTTGAAA AAATAA 

This is predicted to encode a protein having amino acid sequence [<SEQ ID 34>] (SEQ ID NO:
34):

1 MLRKLLKWSA VFLTVSAAVF AALLFVPKDN GRAYRIKIAK NQGISSVGRK

51 LAEDRIVFSR HVLTAAAYVL GVHNRLHTGT YRLPSEVSAW DILQKMRGGR 

101 PDSVTVQIIE GSRFSHMRKV IDATPDIEHD TKGWSNEKLM AEVAPDAFSG 

151 NPEGQFFPDS YEIDAGGSDL RIYQIAYKAM QRRLNEAWES RQDGLPYKNP 

201 YEMLIMASLI EKETGHEADR DHVASVFVNR LKIGMRLQTD PSVIYGMGAA 

251 YKGKIRKADL RRDTPYNTYT RGGLPPTPIA LPGKAALDAA AHPSGEKYLY 

301 FVSKMDGTGL SQFSHDLTEH NAAVRKYILK K* 

A leader peptide is underlined. 

ORF7a (SEQ ID NO: 34) and ORF7-1 (SEQ ID NO: 32) show 98.8% identity in 331 aa overlap:

10 20 30 40 50 60 

orf 7a . pep MLRKLLKWSAVFLTVSAAVFAALLFVPKDNGRAYRIKIAKNQGISSVGRKLAEDRIVFSR 
i I I I I I M I M I I I I I I II li I I I I I I I I II I I I I I I I I I I I I I I I I I I I I I I I I I I I I 
orf 7-1 MLRKLLKWSAVFLTVSAAVFAALLFVPKDNGRAYRIKIAKNQGISSVGRKLAEDRIVFSR 

10 20 30 40 50 60 

70 80 90 100 110 120 

orf 7a. pep HVLTAAAYVLGVHNRLHTGTYRLPSEVSAWDILQKMRGGRPDSVTVQIIEGSRFSHMRKV 

L I ! 1 1 k 1 1 1 1 1 1 1 i 1 1 1 1 1 1 1 1 1 1 1 1 1 !! 1 1 1 1 1 i 1 1 1 1 1 1 1 1 1 1 1 1 1 1 
orf 7-1 HVLTAAAYVLGVHNRLHTGTYRLPSEVSAWDILQKMRGGRPDSVTVQIIEGSRFSHMRKV

70 80 90 100 110 120 

130 140 150 160 170 180 

orf 7a . pep IDATPDIEHDTKGWSNEKLMAEVAPDAFSGNPEGQFFPDSYEIDAGGSDLRIYQIAYKAM 

5 ! 1 1 1 1 1 1 1 1 1 1 1 1 1 II I M 1 1 1 1 1 M I II M 1 1 1 1 1 1 1 1 1 1 1 1 Ml hi 1 1 Mill 

orf 7-1 IDATPDIGHDTKGWSNEKLMAEVAPDAFSGNPEGQFFPDSYEIDAGGSDLQIYQTAYKAM 

130 140 150 160 170 180 

190 200 210 220 230 240 

orf 7a . pep QRRLNEAWESRQDGLPYKNPYEMLIMASLIEKETGHEADRDHVASVFVNRLKIGMRLQTD 

10 I I I I I 1 II I I I I I I 1 I I I I I I I I I I I I I h I I I I I I I I I I ! I I I II I I I I I I I II I I ; 

orf 7-1 QRRLNEAWESRQDGLPYKNPYEMLIMASLVEKETGHEADRDHVASVFVNRLKIGMRLQTD 

190 200 210 220 230 240 

250 260 270 280 290 300 

orf 7a . pep PSVI YGMGAAYKGKIRKADLRRDTPYNTYTRGGLPPTPIALPGKAALDAAAHPSGEKYLY 

15 | M | ! I I I I I I II I I I I I I I I I . I I M I M I I I I II I I I I I I M I I I I I I I I I I I h 

orf 7-1 PSVIYGMGAAYKGKIRKADLRRDTPYNTYTRGGLPPTPIALPGKAALDAAAHPSGEKYLY 

250 260 270 280 290 300 

310 320 330 

orf 7a .pep FVSKMDGTGLSQFSHDLTEHNAAVRKYILKKX 

20 | | | | M I I I I I I I M I I I I I I I I I I I I I I I I I 

orf 7- 1 FVSKMDGTGLSQFSHDLTEHNAAVRKYILKKX 

310 320 330 
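The identity figure quoted for this alignment can be checked column by column. The following is a minimal sketch only: the function name `percent_identity` is hypothetical, the two rows are residues 121-180 as printed in the alignment above, and the overall figure assumes that the four substitutions visible in the printed rows are the only differences across the 331 aligned columns.

```python
def percent_identity(row_a: str, row_b: str) -> float:
    """Percent of aligned columns that hold the same residue in both rows."""
    if len(row_a) != len(row_b):
        raise ValueError("aligned rows must have equal length")
    # A gap character never counts as a match.
    matches = sum(1 for a, b in zip(row_a, row_b) if a == b and a != "-")
    return 100.0 * matches / len(row_a)

# Residues 121-180 of the two aligned sequences, copied from the alignment.
orf7a_row = "IDATPDIEHDTKGWSNEKLMAEVAPDAFSGNPEGQFFPDSYEIDAGGSDLRIYQIAYKAM"
orf7_1_row = "IDATPDIGHDTKGWSNEKLMAEVAPDAFSGNPEGQFFPDSYEIDAGGSDLQIYQTAYKAM"

# Three of these 60 columns differ.
block_identity = percent_identity(orf7a_row, orf7_1_row)

# Assumption: four differing columns in total over the 331 aa overlap.
overall_identity = 100.0 * (331 - 4) / 331
```

Under that assumption the overall value rounds to the 98.8% stated above.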

Homology with a predicted ORF from N. gonorrhoeae 

ORF7 (SEQ ID NO: 30) shows 94.7% identity over a 187aa overlap with a predicted ORF
(ORF7.ng) (SEQ ID NO: 36) from N. gonorrhoeae:

orf 7 MRGGRPDSVTVQIIEGSRFSHMRKVIDATPDIGHDTKGWSNEKLMAEVAPDAFSGNPEGQ 6 0 

1 1 1 1 1 1 II 1 1 1 1 1 1 1 1 1 1 M 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 Ml 1 1 1 1 1 1 1 1 1 1 

orf 7ng MRGGRPDSVTVQIIEGSRFSHMRKVIDATPDIGHDTKGWSNEKLMAEVAPDAFSGNPEGQ 60 

orf 7 FFPDSYEIDAGGSDLQIYQTAYKAMQRRLNEAWESRQDGLPYKNPYEMLIMAXLVEKETG 120 

30 | | | | | | | | | | | | | | | | | | | | | | | | | | | || | | | | : I | I I I I I I I I I I I I I I I h I I I I I 

orf 7ng FFPDSYEIDAGGSDLQIYQTAYKAMQRRLNEAWAGRQDGLPYKNPYEMLIMASLIEKETG 120 

or f 7 HEAXXDHVASVFVNRLKIGMRLQTXXSVI YGMGAAYKGKIRKADLRRDTPYNTYTRGGLP 180 

Ml I I I I I I M I I I I I I I I I I I I I I I I I I I I I I M I I I I I I I M I I I I I I I I I I 
orf 7ng HEADRDHVASVFVNRLKIGMRLQTDPSVIYGMGAAYKGKIRKADLRRDTPYNTYTGGGLP 180 

35 orf7 PTPIALP 187 

II 1 1 1 1 

orf 7ng PTRIALPGKAAMDAAAHPSGEKYLYFVSKMDGTGLSQFSHDLTEHNAAVRKYILKK 236 

An ORF7ng nucleotide sequence [<SEQ ID 35>] (SEQ ID NO: 35) is predicted to encode a protein
having amino acid sequence [<SEQ ID 36>] (SEQ ID NO: 36):



1 MRGGRPDSVT VQIIEGSRFS HMRKVIDATP DIGHDTKGWS NEKLMAEVAP 
51 DAFSGNPEGQ FFPDSYEIDA GGSDLQIYQT AYKAMQRRLN EAWAGRQDGL 
101 PYKNPYEMLI MASLIEKETG HEADRDHVAS VFVNRLKIGM RLQTDPSVIY 
151 GMGAAYKGKI RKADLRRDTP YNTYTGGGLP PTRIALPGKA AMDAAAHPSG
201 EKYLYFVSKM DGTGLSQFSH DLTEHNAAVR KYILKK* 



Further sequence analysis revealed a partial DNA sequence of ORF7ng [<SEQ ID 37>] (SEQ ID
NO: 37):



1 . . taccgaatca AGATTGCCAA AAATCAGGGT ATTTCGTCGG TCGGCAGGAA 

51 ACTTGCcgaA GACCGCATCG TGTTCAGCAG GCATGTTTTG ACAGCGGCGG 

101 CCTACGTTTT GGGTGTGCAC AACAGGCTGC ATACGGGGAC gTACAGATTG 

151 CCTTCGGAAG TGTCTGCTTG GGATATCTTG CAGAAAATGC GCGGCGGCAG 

201 GCCGGATTCC GTTACCGTGC AGATTATCGA AGGTTCGCGT TTTTCGCATA 

251 TGAGGAAAGT CATCGACGCA ACGCCCGACA TCGGACACGA CACCAAAGGC 

301 TGGAGCAATG AAAAACTGAT GGCGGAAGTT GCGCCCGATG CCTTCAGCGG 

351 CAATCCTGAA GGGCAGTTTT TTCCCGACAG CTACGAAATC GATGCGGGCG 

401 GCAGCGATTT GCAGATTTAC CAAACCGCCT ACAAGGCGAT GCAACGCCGC

451 CTGAACGAGG CATGGGCAGG CAGGCAGGAC GGGCTGCCTT ATAAAAACCC

501 TTATGAAATG CTGATTATGG CGAGCCTGAT CGAAAAGGAA ACGGGGCATG 

551 AGGCCGACCG CGACCATGTC GCTTCCGTCT TCGTCAACCG CCTGAAAATC 

601 GGTATGCGCC TGCAAACCGA CCCGTCCGTG ATTTACGGCA TGGGTGCGGC 

651 ATACAAGGGC AAAATCCGTA AAGCCGACCT GCGCCGCGAC ACGCCGTACA 

701 aCAccTAtac gggcgggggc ttgccgccaa cccggattgc gctgcccggC 

751 Aaggcggcaa tggatgccgc cgcccacccg tccggcgaAa aatacctgTa 

801 tttcgtgtcC AAAATGGACG GCACGGGCTT GAGCCAGTTC AGCCATGATT 

851 TGACCGAACA CAACGCCGCc gTcCGCAAAT ATATTTTGAA AAAATAA 



This corresponds to the amino acid sequence [<SEQ ID 38; ORF7ng-1>] (SEQ ID NO: 38;
ORF7ng-1):



1 . . YRIKIAKNQG ISSVGRKLAE DRIVFSRHVL TAAAYVLGVH NRLHTGTYRL 

51 PSEVSAWDIL QKMRGGRPDS VTVQIIEGSR FSHMRKVIDA TPDIGHDTKG 

101 WSNEKLMAEV APDAFSGNPE GQFFPDSYEI DAGGSDLQIY QTAYKAMQRR 

151 LNEAWAGRQD GLPYKNPYEM LIMASLIEKE TGHEADRDHV ASVFVNRLKI 

201 GMRLQTDPSV IYGMGAAYKG KIRKADLRRD TPYNTYTGGG LPPTRIALPG

251 KAAMDAAAHP SGEKYLYFVS KMDGTGLSQF SHDLTEHNAA VRKYILKK* 



ORF7ng-1 (SEQ ID NO: 38) and ORF7-1 (SEQ ID NO: 32) show 98.0% identity in 298 aa
overlap:



10 20 30 40 50 60 

orf 7- 1 . pep KLLKWSAVFLTVSAAVFAALLFVPKDNGRAYRIKIAKNQGISSVGRKLAEDRIVFSRHVL 

IIIIIMIIIIIIIIIIIIIIIIIIIIIII 
orf 7ng-l YRIKIAKNQGISSVGRKLAEDRIVFSRHVL 

10 20 , 30 



70 80 90 100 110 120 

orf 7 - 1 . pep TAAAYVLGVHNRLHTGT YRL PS E VS AWD I LQ KMRGGR PDS VTVQ I IEGSRFSHMRKVIDA 

I illlllMIIIIIIIMIM IIIIIIIIMIIIIMIMIIM MIIIIIMM 

orf 7ng- 1 TAAAYVLGVHNRLHTGTYRLPSEVSAWD I LQKMRGGRPDSVTVQI IEGSRFSHMRKVIDA 

40 50 60 70 80 90 



130 140 150 160 170 180 

orf 7- 1 . pep TPDIGHDTKGWSNEKLMAEVAPDAFSGNPEGQFFPDSYEIDAGGSDLQIYQTAYKAMQRR 

II M 1 1 1 M 1 1 1 1 1 1 1 M I M 1 1 1 M 1 1 1 1 1 1 1 1 1 1 1 1 M 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 

orf 7ng-l TPDIGHDTKGWSNEKLMAEVAPDAFSGNPEGQFFPDSYEIDAGGSDLQIYQTAYKAMQRR 
100 110 120 130 140 150



190 200 210 220 230 240 

orf 7-1.pep LNEAWESRQDGLPYKNPYEMLIMASLVEKETGHEADRDHVASVFVNRLKIGMRLQTDPSV

Mill : 1 1 1 1 1 M 1 1 1 II 1 1 1 M I hill 1 1 II 1 1 1 1 1 1 1 1 1 1 M 1 1 1 1 1 1 1 .1 1 1 ! i 

orf 7ng-l LNEAWAGRQDGLPYKNPYEMLIMASLIEKETGHEADRDHVASVFVNRLKIGMRLQTDPSV 

160 170 180 190 200 210 

250 260 270 280 290 300 

orf 7 - 1 . pep I YGMGAAYKGKIRKADLRRDTPYNTYTRGGLPPTPIALPGKAALDAAAHPSGEKYLYFVS 

IIIIMIIIIIIIIIIIIIIIIIIIII Mill I I I I I I I I = I I i I 1 I I I I 1 1 1 1 1 1 1 

orf7ng-l IYGMGAAYKGKIRKADLRRDTPYNTYTGGGLPPTRIALPGKAAMDAAAHPSGEKYLYFVS 

220 230 240 250 260 270 






310 320 330 

orf 7-1 .pep KMDGTGLSQFSHDLTEHNAAVRKYILKKX 

I I I I I II I I I I I II I I I I I I I I I I I i II 
orf 7ng-l KMDGTGLSQFSHDLTEHNAAVRKYILKKX 

280 290 



In addition, ORF7ng-1 (SEQ ID NO: 38) shows significant homology with a hypothetical E. coli
protein (SEQ ID NO: 1114):






sp|P28306|YCEG_ECOLI HYPOTHETICAL 38.2 KD PROTEIN IN PABC-HOLB INTERGENIC REGION 
gi | 1787339 (AE000210) o340; 100% identical to fragment YCEG_ECOLI SW: P28306 but 
has 97 additional C-terminal residues [Escherichia coli] Length = 340 

Score = 79 (36.2 bits), Expect = 5.0e-57, Sum P(2) = 5.0e-57 

Identities = 20/87 (22%), Positives = 40/87 (45%) 






Query: 10 GISSVGRKLAEDRIVFSRHVLTAAAYVLGVHNRLHTGTYRLPSEVSAWDILQKMRGGRPD 69 

G ++G +L D+I+ V + + GTYR +++ ++L+ + G+ 

Sbjct: 49 GRLALGEQLYADKIINRPRVFQWLLRIEPDLSHFKAGTYRFTPQMTVREMLKLLESGKEA 108 

Query: 70 SVTVQIIEGSRFSHMRKVIDATPDIGH 96 

++++ EG R S K + P I H 
Sbjct: 109 QFPLRLVEGMRLSDYLKQLREAPYIKH 135 

Score = 438 (200.7 bits), Expect = 5.0e-57, Sum P(2) = 5.0e-57 
Identities = 84/155 (54%), Positives = 111/155 (71%) 






Query: 120 EGQFFPDSYEIDAGGSDLQIYQTAYKAMQRRLNEAWAGRQDGLPYKNPYEMLIMASLIEK 179 

EG F+PD++ A +D+ + + A+K M + ++ AW GR DGLPYK+ +++ MAS+IEK 
Sbjct: 158 EGWFWPDTWMYTANTTDVALLKI^AHKKMVKAVDSAWEGRADGLPYKDKWQLVTMAS I IEK 217 

Query: 180 ETGHEADRDHVASVFVNRLKIGMRLQTDPSVIYGMGAAYKGKIRKADLRRDTPYNTYTGG 239

ET ++RD VASVF+NRL+ IGMRLQTDP+VI YGMG Y GK+ +ADL T YNTYT 
Sbjct: 218 ETAVASERDKVASVFINRLRIGMRLQTDPTVIYGMGERYNGKLSRADLETPTAYNTYTIT 277 

Query: 240 GLPPTRIALPGKAAMDAAAHPSGEKYLYFVSKMDG 274

GLPP IA PG ++ AAAHP+ YLYFV+ G 
Sbjct: 278 GLPPGAIATPGADSLKAAAHPAKTPYLYFVADGKG 312 
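The percentages in the two summary lines of this hit can be recomputed from the printed counts. The sketch below is illustrative only: the pattern `SUMMARY` and the function `check_summary` are hypothetical names, and the use of integer truncation is an observation that happens to reproduce the figures printed for these two segments, not a statement about how the search program computes them.

```python
import re

# Matches a summary line such as "Identities = 20/87 (22%), Positives = 40/87 (45%)".
SUMMARY = re.compile(
    r"Identities = (\d+)/(\d+) \((\d+)%\), Positives = (\d+)/(\d+) \((\d+)%\)"
)

def check_summary(line: str) -> dict:
    """Return (count, length, printed %, recomputed %) for identities and positives."""
    m = SUMMARY.search(line)
    if m is None:
        raise ValueError("not an identities/positives summary line")
    id_n, id_len, id_pct, pos_n, pos_len, pos_pct = map(int, m.groups())
    return {
        # Recomputed values use integer truncation (an assumption, see lead-in).
        "identities": (id_n, id_len, id_pct, 100 * id_n // id_len),
        "positives": (pos_n, pos_len, pos_pct, 100 * pos_n // pos_len),
    }

first_segment = check_summary("Identities = 20/87 (22%), Positives = 40/87 (45%)")
second_segment = check_summary("Identities = 84/155 (54%), Positives = 111/155 (71%)")
```

For both segments the recomputed value equals the printed one.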






Based on this analysis, including the fact that the H. influenzae YCEG protein possesses a possible 
leader sequence, it is predicted that the proteins from N. meningitidis and N. gonorrhoeae, and their
epitopes, could be useful antigens for vaccines or diagnostics, or for raising antibodies. 

Example 6 

The following partial DNA sequence was identified in N. meningitidis [<SEQ ID 39>] (SEQ ID 
NO: 39) : 

1 CGTTTCAAAA TGTTAACTGT GTTGACGGCA ACCTTGATTG CCGGACAGGT 

51 ATCTGCCGCC GGAGGCGGTG CGGGGGATAT GAAACAGCCG AAGGAAGTCG 

101 GAAAGGTTTT CAGAAAGCAG CAGCGTTACA GCGAGGAAGA AATCAAAAAC 

151 GAACGCGCAC GGCTTGCGGC AGTGGGCGAG CGGGTTAATC AGATATTTAC 

201 GTTGCTGGGA GGGGAAACCG CCTTGCAAAA GGGGCAGGCG GGAACGGCTC 

251 TGGCAACCTA TATGCTGATG TTGGAACGCA CAAAATCCCC CGAAGTCGCC 

301 GAACGCGCCT TGGAAATGGC CGTGTCGCTG AACGCGTTTG AACAGGCGGA 

351 AATGATTTAT CAGAAATGGC GGCAGATTGA GCCTATACCG GGTAAGGCGC 

401 AAAAACGGGC GGGGTGGCTG CGGAACGTGC TGAGGGAAAG AGGAAATCAG

451 CATCTGGACG GACGGGAAGA AGTGCTGGCT CAGGCGGACG AAGGACAG 

This corresponds to the amino acid sequence [<SEQ ID 40; ORF9>] (SEQ ID NO: 40; ORF9):

1 . . RFKMLTVLTA TLIAGQVSAA GGGAGDMKQP KEVGKVFRKQ QRYSEEEIKN 

51 ERARLAAVGE RVNQIFTLLG GETALQKGQA GTALATYMLM LERTKSPEVA 

101 ERALEMAVSL NAFEQAEMIY QKWRQIEPIP GKAQKRAGWL RNVLRERGNQ 

151 HLDGREEVLA QADEGQ 

Further sequence analysis revealed the complete DNA sequence [<SEQ ID 41>] (SEQ ID NO: 41):



1 ATGTTACCTA ACCGTTTCAA AATGTTAACT GTGTTGACGG CAACCTTGAT 

51 TGCCGGACAG GTATCTGCCG CCGGAGGCGG TGCGGGGGAT ATGAAACAGC 

101 CGAAGGAAGT CGGAAAGGTT TTCAGAAAGC AGCAGCGTTA CAGCGAGGAA 

151 GAAATCAAAA ACGAACGCGC ACGGCTTGCG GCAGTGGGCG AGCGGGTTAA 

201 TCAGATATTT ACGTTGCTGG GAGGGGAAAC CGCCTTGCAA AAGGGGCAGG 

251 CGGGAACGGC TCTGGCAACC TATATGCTGA TGTTGGAACG CACAAAATCC 

301 CCCGAAGTCG CCGAACGCGC CTTGGAAATG GCCGTGTCGC TGAACGCGTT 

351 TGAACAGGCG GAAATGATTT ATCAGAAATG GCGGCAGATT GAGCCTATAC

401 CGGGTAAGGC GCAAAAACGG GCGGGGTGGC TGCGGAACGT GCTGAGGGAA
451 AGAGGAAATC AGCATCTGGA CGGACTGGAA GAAGTGCTGG CTCAGGCGGA
501 CGAAGGACAG AACCGCAGGG TGTTTTTATT GTTGGCACAA GCCGCCGTGC 
551 AACAGGACGG GTTGGCGCAA AAAGCATCGA AAGCGGTTCG CCGCGCGGCG 
601 TTGAAATATG AACATCTGCC CGAAGCGGCG GTTGCCGATG TGGTGTTCAG 
651 CGTACAGGGA CGCGAAAAGG AAAAGGCAAT CGGAGCTTTG CAGCGTTTGG 
701 CGAAGCTCGA TACGGAAATA TTGCCCCCCA CTTTAATGAC GTTGCGTCTG 
751 ACTGCACGCA AATATCCCGA AATACTCGAC GGCTTTTTCG AGCAGACAGA 
801 CACCCAAAAC CTTTCGGCCG TCTGGCAGGA AATGGAAATT ATGAATCTGG 
851 TTTCCCTGCA CAGGCTGGAT GATGCCTATG CGCGTTTGAA CGTGCTGTTG 
901 GAACGCAATC CGAATGCAGA CCTGTATATT CAGGCAGCGA TATTGGCGGC 
951 AAACCGAAAA GAAGGTGCTT CCGTTATCGA CGGCTACGCC GAAAAGGCAT 

1001 ACGGCAGGGG GACGGAGGAA CAGCGGAGCA GGGCGGCGCT AACGGCGGCG 

1051 ATGATGTATG CCGACCGCAG GGATTACGCC AAAGTCAGGC AGTGGCTGAA 

1101 AAAAGTATCC GCGCCGGAAT ACCTGTTCGA CAAAGGTGTG CTGGCGGCTG 
1151 CGGCGGCTGT CGAGTTGGAC GGCGGCAGGG CGGCTTTGCG GCAGATCGGC

1201 AGGGTGCGGA AACTTCCCGA ACAGCAGGGG CGGTATTTTA CGGCAGACAA 

1251 TTTGTCCAAA ATACAGATGC TCGCCCTGTC GAAGCTGCCC GATAAACGGG 

1301 AGGCTTTGAG GGGGTTGGAC AAGATTATCG AAAAACCGCC TGCCGGCAGT 

1351 AATACAGAGT TACAGGCAGA GGCATTGGTA CAGCGGTCAG TTGTTTACGA 

1401 TCGGCTTGGC AAGCGGAAAA AAATGATTTC AGATCTTGAA AGGGCGTTCA 

1451 GGCTTGCACC CGATAACGCT CAGATTATGA ATAATCTGGG CTACAGCCTG 

1501 CTGACCGATT CCAAACGTTT GGACGAAGGT TTCGCCCTGC TTCAGACGGC 

1551 ATACCAAATC AACCCGGACG ATACCGCTGT CAACGACAGC ATAGGCTGGG 

1601 CGTATTACCT GAAAGGCGAC GCGGAAAGCG CGCTGCCGTA TCTGCGGTAT 

1651 TCGTTTGAAA ACGACCCCGA GCCCGAAGTT GCCGCCCATT TGGGCGAAGT 

1701 GTTGTGGGCA TTGGGCGAAC GCGATCAGGC GGTTGACGTA TGGACGCAGG 

1751 CGGCACACCT TACGGGAGAC AAGAAAATAT GGCGGGAAAC GCTCAAACGT 

1801 CACGGCATCG CATTGCCCCA ACCTTCCCGA AAACCTCGGA AATAA 



This corresponds to the amino acid sequence [<SEQ ID 42; ORF9-1>] (SEQ ID NO: 42; ORF9-1):



1 MLPNRFKMLT VLTATLIAGQ VSAAGG GAGD MKQPKEVGKV FRKQQRYSEE 

51 EIKNERARLA AVGERVNQIF TLLGGETALQ KGQAGTALAT YMLMLERTKS 

101 PEVAERALEM AVSLNAFEQA EMIYQKWRQI EPIPGKAQKR AGWLRNVLRE 

151 RGNQHLDGLE EVLAQADEGQ NRRVFLLLAQ AAVQQDGLAQ KASKAVRRAA 

201 LKYEHLPEAA VADVVFSVQG REKEKAIGAL QRLAKLDTEI LPPTLMTLRL

251 TARKYPEILD GFFEQTDTQN LSAVWQEMEI MNLVSLHRLD DAYARLNVLL 

301 ERNPNADLYI QAAILAANRK EGASVIDGYA EKAYGRGTEE QRSRAALTAA 

351 MMYADRRDYA KVRQWLKKVS APEYLFDKGV LAAAAAVELD GGRAALRQIG 

401 RVRKLPEQQG RYFTADNLSK IQMLALSKLP DKREALRGLD KIIEKPPAGS 

451 NTELQAEALV QRSVVYDRLG KRKKMISDLE RAFRLAPDNA QIMNNLGYSL

501 LTDSKRLDEG FALLQTAYQI NPDDTAVNDS IGWAYYLKGD AESALPYLRY 

551 SFENDPEPEV AAHLGEVLWA LGERDQAVDV WTQAAHLTGD KKIWRETLKR 

601 HGIALPQPSR KPRK* 



Computer analysis of this amino acid sequence gave the following results: 
Homology with a predicted ORF from N. meningitidis (strain A)

ORF9 (SEQ ID NO: 40) shows 89.8% identity over a 166aa overlap with an ORF (ORF9a) (SEQ
ID NO: 44) from strain A of N. meningitidis:



orf 9 .pep 



10 20 30 40 50 

RFKMLTVLTATLIAGQVSAAGGGAGDMKQPKEVGKVFRKQQRYSEEEIKNERARLA 



orf 9a 




10 20 30 40 50 



orf 9 .pep 



60 70 80 90 100 110 

AVGERVNQIFTLLGGETALQKGQAGTALATYMLMLERTKSPEVAERALEMAVSLNAFEQA 



orf 9a 




60 70 80 90 100 110 



orf 9 .pep 



120 130 140 150 160 

EM I YQKWRQ I E P I PGKAQKRAGWLRNVLRERGNQHLDGRE EVLAQADEGQ 



orf 9a 




120 130 140 150 160 170 
orf9a AAVQQDGLAQKASKAVRRAALRYEHLPEAAVADVVFSVQXREKEKAIGALQRLAKLDTEI
180 190 200 210 220 230 

The complete length ORF9a nucleotide sequence [<SEQ ID 43>] (SEQ ID NO: 43) is:

1 ATGTTACCCG CCCGTTTCAC CATTTTATCT GTGCTCGCGG CAGCCCTGCT 

51 TGCCGGGCAG GCGTATGCCG CCGGCGCGGC GGATGCGAAG CCGCCGAAGG 

101 AAGTCGGAAA GGTTTTCAGA AAGCAGCAGC GTTACAGCGA GGAAGAAATC 

151 AAAAACGAAC GCGCACGGCT TGCGGCAGTG GGCGAGCGGG TTAATCAGAT 

201 ATTTACGTTG CTGGGANGGG AAACCGCCTT GCAAAAGGGG CAGGCGGGAA 

251 CGGCTCTGGC AACCTATATG CTGATGTTGG AACGCACAAA ATCCCCCGAA 

301 GTCGCCGAAC GCGCCTTGGA AATGGCCGTG TCNCTGAACG CGTTTGAACA 

351 GGCGGAAATG ATTTATCAGA AATGGCGGCA GATTGAGCCT ATACCGGGTA 

401 AGGCGCAAAA ACGGGCGGGG TGGCTGCGGA ACGTGCTGAG GGAAAGAGGA

451 AATCAGCATC TAGACGGACT GGAAGAANTG CTGGCTCAGG CGGACGAANG

501 ACAGAACCGC AGGGTGTTTT TATTGTTGGC ACAAGCCGCC GTGCAACAGG 

551 ACGGGTTGGC GCAAAAAGCA TCGAAAGCGG TTCGCCGCGC GGCGTTGAGA 

601 TATGAACATC TGCCCGAAGC GGCGGTTGCC GATGTGGTGT TCAGCGTACA 

651 GGNACGCGAA AAGGAAAAGG CAATCGGAGC TTTGCAGCGT TTGGCGAAGC 

701 TCGATACGGA AATATTGCCC CCCACTTTAA TGACGTTGCG TCTGACTGCA 

751 CGCAAATATC CCGAAATACT CGACGGCTTT TTCGAGCAGA CAGACACCCA 

801 AAACCTTTCG GCCGTCTGGC AGGAAATGGA AATTATGAAT CTGGTTTCCC 

851 TGCACAGGCT GGATGATGCC TATGCGCGTT TGAACGTGCT GTTGGAACGC 

901 AATCCGAATG CAGACCTGTA TATTCAGGCA GCGATATTGG CGGCAAACCG 

951 AAAAGAANGT GCTTCCGTTA TCGACGGCTA CGCCGAAAAG GCATACGGCA 

1001 GGGGGACGGG GGAACAGCGG GGCAGGGCGG CAATGACGGC GGCGATGATA 

1051 TATGCCGACC GAAGGGATTA CACCAAAGTC AGGCAGTGGT TGAAAAAAGT 

1101 GTCCGCGCCG GAATACCTGT TCGACAAAGG TGTGCTGGCG GCTGCGGCGG 

1151 CTGTCGAGTT GGACNGCGGC AGGGCGGCTT TGCGGCAGAT CGGCAGGGTG 

1201 CGGAAACTTC CCGAACAGCA GGGGCGGTAT TTTACGGCAG ACAATTTGTC 

1251 CAAAATACAG ATGTTCGCCC TGTCGAAGCT GCCCGACAAA CGGGAGGCTT 

1301 TGAGGGGGTT GGACAAGATT ATCGAAAAAC CGCCTGCCGG CAGTAATACA 

1351 GAGTTACAGG CAGAGGCATT GGTACAGCGG TCAGTTGTTT ACGATCGGCT

1401 TGGCAAGCGG AAAAAAATGA TTTCAGATCT TGAAAGGGCG TTCAGGCTTG
1451 CACCCGATAA CGCTCAGATT ATGAATAATC TGGGCTACAG CCTGCTTTCC
1501 GATTCCAAAC GTTTGGACGA AGGCTTCGCC CTGCTTCAGA CGGCATACCA 
1551 AATCAACCCG GACGATACCG CTGTCAACGA CAGCATAGGC TGGGCGTATT 
1601 ACCTGAAANG CGACGCGGAA AGCGCGCTGC CGTATCTGCG GTATTCGTTT 
1651 GAAAACGACC CCGAGCCCGA AGTTGCCGCC CATTTGGGCG AAGTGTTGTG 
1701 GGCATTGGGC GAACGCGATC AGGCGGTTGA CGTATGGACG CAGGCGGCAC 
1751 ACCTTACGGG AGACAAGAAA ATATGGCGGG AAACGCTCAA ACGTCACGGC 
1801 ATCGCATTGC CCCAACCTTC CCGAAAACCT CGGAAATAA 

This encodes a protein having amino acid sequence [<SEQ ID 44>] (SEQ ID NO: 44):



1 MLPARFTILS VLAAALLAGQ AYAAGAA DAK PPKEVGKVFR KQQRYSEEEI 

51 KNERARLAAV GERVNQIFTL LGXETALQKG QAGTALATYM LMLERTKSPE 

101 VAERALEMAV SLNAFEQAEM IYQKWRQIEP IPGKAQKRAG WLRNVLRERG 

151 NQHLDGLEEX LAQADEXQNR RVFLLLAQAA VQQDGLAQKA SKAVRRAALR 

201 YEHLPEAAVA DVVFSVQXRE KEKAIGALQR LAKLDTEILP PTLMTLRLTA

251 RKYPEILDGF FEQTDTQNLS AVWQEMEIMN LVSLHRLDDA YARLNVLLER 

301 NPNADLYIQA AILAANRKEX ASVIDGYAEK AYGRGTGEQR GRAAMTAAMI 

351 YADRRDYTKV RQWLKKVSAP EYLFDKGVLA AAAAVELDXG RAALRQIGRV 

401 RKLPEQQGRY FTADNLSKIQ MFALSKLPDK REALRGLDKI IEKPPAGSNT

451 ELQAEALVQR SVVYDRLGKR KKMISDLERA FRLAPDNAQI MNNLGYSLLS

501 DSKRLDEGFA LLQTAYQINP DDTAVNDSIG WAYYLKXDAE SALPYLRYSF 

551 ENDPEPEVAA HLGEVLWALG ERDQAVDVWT QAAHLTGDKK IWRETLKRHG 

601 IALPQPSRKP RK* 
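The relationship between the ORF9a nucleotide listing and the protein listing above can be reproduced with a simple codon-by-codon translation. The following is a minimal sketch under stated assumptions: the names `CODON_TABLE` and `translate` are hypothetical, the standard genetic code is assumed, and any codon containing an ambiguous base (such as N) is rendered as X, which is the convention the listings appear to follow.

```python
from itertools import product

# Standard genetic code, codons enumerated in T, C, A, G order (assumption).
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}

def translate(dna: str) -> str:
    """Translate DNA in frame 1; unknown or ambiguous codons become 'X'."""
    dna = "".join(dna.split()).upper()      # drop block spacing, normalise case
    usable = len(dna) - len(dna) % 3        # ignore a trailing partial codon
    return "".join(
        CODON_TABLE.get(dna[i:i + 3], "X") for i in range(0, usable, 3)
    )

# First 30 nucleotides of the ORF9a listing (SEQ ID 43).
start = translate("ATGTTACCCG CCCGTTTCAC CATTTTATCT")

# Nucleotides 211-225 of the same listing, which include an ambiguous base N.
ambiguous = translate("CTGGGANGGGAAACC")

# Last three codons of the listing, ending in the stop codon TAA.
end = translate("CGGAAATAA")
```

The three fragments give MLPARFTILS, LGXET and RK*, in agreement with the corresponding positions of the protein listing.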
ORF9a (SEQ ID NO: 44) and ORF9-1 (SEQ ID NO: 42) show 95.3% identity in 614 aa overlap:



10 20 30 40 50 

orf 9a . pep MLPARFT I LS VLAAALLAGQAYAAG - -AADAKPPKEVGKVFRKQQRYSEEEIKNERARLA 



5 orf 9 - 1 MLPNRFKMLTVLTATLIAGQVSAAGGGAGDMKQPKEVGKVFRKQQRYSEEEIKNERARLA 

10 20 30 40 50 60 



60 70 80 90 100 110 

or f 9a . pep AVGERVNQIFTLLGXETALQKGQAGTALATYMLMLERTKSPEVAERALEMAVSLNAFEQA 

IIIMIIIIIIIII 1 1 1 1 1 1 1 1 1 1 1 1 1 M 1 1 1 1 1 1 1 M U , M I h 1 1 1 : 1 1 1 M 

10 orf 9-1 AVGERVNQIFTLLGGETALQKGQAGTALATYMLMLERTKSPEVAERALEMAVSLNAFEQA 

70 80 90 100 110 120 



120 130 140 150 160 170 

orf 9a . pep EMIYQKWRQIEPIPGKAQKRAGWLRNVLRERGNQHLDGLEEXLAQADEXQNRRVFLLLAQ 

1 1 M 1 1 1 1 1 1 1 1 1 1 1 M 1 1 1 1 1 1 II 1 1 1 II 1 1 1 1 1 1 1 1 II llllll II III I MM I 

15 ■ orf9-l EMIYQKWRQIEPIPGKAQKRAGWLRNVLRERGNQHLDGLEEVLAQADEGQNRRVFLLLAQ 

130 140 150 160 170 180 



180 190 200 210 220 230 

or f 9a . pep AAVQQDGLAQKASKAVRRAALRYEHLPEAAVADWFSVQXREKEKAIGALQRLAKLDTEI 

1 1 1 1 1 1 1 1 1 1 1 ! 1 1 1 1 1 1 1 1 1 : 1 II 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 M 1 1 1 1 1 1 1 1 1 1 1 

20 orf 9-1 AAVQQDGLAQKASKAVRRAALKYEHLPEAAVADWFSVQGREKEKAIGALQRLAKLDTEI 

190 200 210 220 230 240 



240 250 260 270 280 290 

orf 9a . pep LPPTLMTLRLTARKYPEILDGFFEQTDTQNLSAVWQEMEIMNLVSLHRLDDAYARLNVLL 

1 1 1 1 1 1 1 1 1 1 E 1 1 1 1 1 L 1 1 1 1 1 1 i 1 1 1 1 1 1 1 1 i 1 1 1 1 [ 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 i t 

25 orf 9 - 1 LPPTLMTLRLTARKYPEILDGFFEQTDTQNLSAVWQEMEIMNLVSLHRLDDAYARLNVLL 

250 260 270 280 290 300 



300 310 320 330 340 350 

orf 9a . pep ERNPNADLYIQAAILAANRKEXASVIDGYAEKAYGRGTGEQRGRAAMTAAMI YADRRDYT 

II II III III II I I I Mill I 1 1 I 1 I I I I I I I I I I I I I I * I I I = I I I I = I I I I I I I = 
30 orf 9-1 ERNPNADLYIQAAILAANRKEGASVIDGYAEKAYGRGTEEQRSRAALTAAMMYADRRDYA 

310 320 330 340 350 360 



360 370 380 390 400 410 

orf 9a . pep KTOQWLKKVSAPEYLFDKGVLAAAAAVELDXGRAALRQIGRVRKLPEQQGRYFTADNLSK 

1 1 1 1 1 M 1 1 1 1 1 1 1 1 1 1 II 1 1 1 1 1 1 1 M II II II II 1 1 1 M I M M 1 1 M I M 1 1 M 

35 orf 9-1 KVRQWLKKVSAPEYLFDKGVLiAAAAAVELDGGRAALRQIGRVRKLPEQQGRYFTADNLSK 

370 380 390 400 410 420 



420 430 440 450 460 470 

orf 9a. pep IQMFALSKLPDKREALRGLDKIIEKPPAGSNTELQAEALVQRSWYDRLGKRKKMISDLE 

I I I : I I I I I I I I I I I I I I I I I I I II I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 
40 or f 9 - 1 IQMLALSKLPDKREALRGLDKI IEKPPAGSNTELQAEALVQRSWYDRLGKRKKMISDLE 

430 440 450 460 470 480 



480 490 500 510 520 530 

orf 9a. pep* RAFRLAPDNAQIMNNLGYSLLSDSKRLDEGFALLQTAYQINPDDTAVNDSIGWAYYLKXD 

I I I I I I I I I I I I I I I I I I I I I : I I I I I I I I I I I I I I E I I I I I I I I I I I I I I I I I I I I I I 
45 or f 9 - 1 RAFRLAPDNAQIMNNLGYSLLTDSKRLDEGFALLQTAYQINPDDTAVNDS IGWAYYLKGD 

490 500 510 520 530 540 



orf 9a . pep 



540 550 560 570 580 590 

AESALPYLRYSFENDPEPEVAAHLGEVLWALGERDQAVDVWTQAAHLTGDKKIWRETLKR 
1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 M 1 1 1 1 1 1 1 1 1 1 1 1 1

or f 9-1 AESALPYLRYSFENDPEPEVAAHLGEVLWALGERDQAVDVWTQAAHLTGDKKIWRETLKR 

550 560 570 580 590 600 

600 610 
5 orf 9a . pep HGIALPQPSRKPRKX 

I ! I I ! 1 I I I I I I I I 
orf 9-1 HGIALPQPSRKPRKX 

610 

Homology with a predicted ORF from N. gonorrhoeae

ORF9 (SEQ ID NO: 40) shows 82.8% identity over a 163aa overlap with a predicted ORF
(ORF9.ng) (SEQ ID NO: 46) from N. gonorrhoeae:





orf 9 RFKMLTVLTATLIAGQVSAAGGGAGDMKQPKEVGKVFRKQQRYSEEEIKNERAR 54

II :|:||:|:|:|||: 1 ||:|:: I I I I I 1 | = 1 1 = = 1 1 1 1 1 1 1 1 1 1 1 II

orf 9ng MIMLPARFTILSVLAAALLAGQAYAA--GAADVELPKEVGKVLRKHRRYSEEEIKNERAR 58

orf 9 LAAVGERVNQIFTLLGGETALQKGQAGTALATYMLMLERTKSPEVAERALEMAVSLNAFE 114

Ml 1 M M 1 1 1 1 II 1 i 1 M 1 M 1 1 1 II 1 1 1 1 1 1 1 M 1 1 1 1 1 M 1 1 1 M 1 1 li 1 1

orf 9ng LAAVGERVNRVFTLLGGETALQKGQAGTALATYMLMLERTKSPEVAERALEMAVSLNAFE 118

orf 9 QAEMIYQKWRQIEPIPGKAQKRAGWLRNVLRERGNQHLDGREEVLAQADEGQ 166

Illlllllllllllllhlll IIMIIIhl II IN III Ihl

orf 9ng QAEMIYQKWRQIEPIPGEAQKPAGWLRNVLKEGGNPHLDRLEEVPAQSDYVHQPMIFLLL 178



The ORF9ng nucleotide sequence [<SEQ ID 45>] (SEQ ID NO: 45) was predicted to encode a
protein including the amino acid sequence [<SEQ ID 46>] (SEQ ID NO: 46):



1 MIMLPARFTI LSVLAAALLA GQAYAAGAA D VELPKEVGKV LRKHRRYSEE 

51 EIKNERARLA AVGERVNRVF TLLGGETALQ KGQAGTALAT YMLMLERTKS

101 PEVAERALEM AVSLNAFEQA EM I YQKWRQ I EPIPGEAQKP AGWLRNVLKE 

151 GGNPHLDRLE EVPAQSDYVH QP MIFLLLVQ AAVQHGGVA Q KPSKAVRPAA 

201 YNYEVLPETA GADAVFCVQG PQYEKAIQSF PPCGRNPQTE NIAPPFNELF 

251 RPTARPISPK LLQRFFRTEP NLAKPFRPPG PEMETYQTGF PRPLTRNNPT 






Amino acids 1-28 are a putative leader sequence, and 173-189 are predicted to be a transmembrane 
domain. 
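The residue ranges given above use coordinates counted from 1 and inclusive at both ends. The sketch below shows how the two regions can be read out of the partial ORF9ng protein; the helper `region` and the constant `ORF9NG_PARTIAL` are hypothetical names, and only the first 200 residues of the listing are included.

```python
# First 200 residues of the partial ORF9ng protein (SEQ ID 46), copied from the
# listing above with the block spacing removed.
ORF9NG_PARTIAL = "".join("""
MIMLPARFTI LSVLAAALLA GQAYAAGAAD VELPKEVGKV LRKHRRYSEE
EIKNERARLA AVGERVNRVF TLLGGETALQ KGQAGTALAT YMLMLERTKS
PEVAERALEM AVSLNAFEQA EMIYQKWRQI EPIPGEAQKP AGWLRNVLKE
GGNPHLDRLE EVPAQSDYVH QPMIFLLLVQ AAVQHGGVAQ KPSKAVRPAA
""".split())

def region(seq: str, start: int, end: int) -> str:
    """Return residues start..end, counted from 1 and inclusive at both ends."""
    if not 1 <= start <= end <= len(seq):
        raise ValueError("coordinates fall outside the sequence")
    return seq[start - 1:end]

leader = region(ORF9NG_PARTIAL, 1, 28)            # putative leader sequence
transmembrane = region(ORF9NG_PARTIAL, 173, 189)  # predicted transmembrane domain
```

With these coordinates the predicted transmembrane stretch is the 17-residue run MIFLLLVQAAVQHGGVA.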



Further sequence analysis revealed the complete length ORF9ng DNA sequence [<SEQ ID 47>]
(SEQ ID NO: 47):

1 ATGTTACCCG CCCGTTTCAC TATTTTATCT GTCCTCGCAG CAGCCCTGCT

51 TGCCGGACAG GCGTATGCTG CCGGCGCGGC GGATGTGGAG CTGCCGAAGG 
101 AAGTCGGAAA GGTTTTAAGG AAACATCGGC GTTACAGCGA GGAAGAAATC 
151 AAAAACGAAC GCGCACGGCT TGCGGCAGTG GGCGAACGGG TCAACAGGGT 
201 GTTTACGCTG TTGGGCGGTG AAACGGCTTT GCAGAAAGGG CAGGCGGGAA 

251 CGGCTCTGGC AACCTATATG CTGATGTTGG AACGCACAAA ATCCCCCGAA

301 GTCGCCGAAC GCGCCTTGGA AATGGCCGTG TCGCTGAACG CGTTTGAACA 






351 GGCGGAAATG ATTTATCAGA AATGgcggca gatcgagcct ataCcgggtg

401 aggcgcaaaa accgGcgggG tggctgcgga acgtattgaa ggaagggGGa
451 aaTCAGCATC TGGAcgggtt gaaagaggTG CtggcgcaAT cggacgatGT
501 GCAAAAAcgc aggaTATTTT TGCTGCTGGT GCAAGCCGCC GTGCagcagg

551 gTGGGGTGGC TCAAAAAGCA TCGAAAGCGG TTCGCcgtgc GGcgttgaAG

601 TATGAACATC TGCCcgaagc ggcggTTGCC GATGcggTGT TCGGCGTACA 

651 GGGACGCGAA AAGGAAAagg caaTCGAAGC TTTGCAGCGT TTGGCGAAGC 

701 TCGATACGGA AATATTGCCC CCCACTTTAA TGACGTTGCG TCTGACTGCA 

751 CGCAAATATC CCGAAATACT CGACGGCTTT TTCGAGCAGA CAGACACCCA 

801 AAACCTTTCG GCCGTCTGGC AGGAAATGGA AATTATGAAT CTGGTTTCCC

851 TGCGTAAGCC GGATGATGCC TATGCGCGTT TGAACGTGCT GTTGGAACAC 

901 AACCCGAATG CAAACCTGTA TATTCAGGCG GCGATATTGG CGGCAAACCG 

951 AAAAGAAGGT GCGTCCGTTA TCGACGGCTA CGCCGAAAAG GCATACGGCA 

1001 GGGGGACGGG GGAACAGCGG GGCagggcgg cAATgacggc GGCGATGATA 

1051 TATGCCGACC GCAGGGATTA CGCCAAAGTC AGGCAGTGGT TGAAAAAAGT

1101 GTCCGCGCCG GAATACCTGT TCGACAAAGG CGTGCTGGCG GCTGCGGCGG 

1151 CTGCCGAATT GGACGGAGGC CGGGCGGCTT TGCGGCAGAT CGGCAGGGTG 

1201 CGGAAACTTC CCGAACAGCA GGGGCGGTAT TTTACGGCAG ACAATTTGTC 

1251 CAAAATACAG ATGCTCGCCC TGTCGAAGCT GCCCGACAAA CGGGAAGCCC 

1301 TGATCGGGCT GAACAACATC ATCGCCAAAC TTTCGGCGGC GGGAAGCACG

1351 GAACCTTTGG CGGAAGCATT GGCACAGCGT TCCATTATTT ACGaacAGTT 

1401 cggCAAACGG GGAAAAATGA TTGCCGACCT tgaAACcgcg CTCAAACTTA

1451 CGCCCGATAA TGCACAAATT ATGAATAATC TGGGCTACAG CCTGCTTTCC

1501 GATTCCAAAC GTTTGGACGA GGGTTTCGCC CTGCTTCAGA CGGCATACCA 

1551 AATCAACCCG GACGATACCG CCGTTAACGA CAGCATAGGC TGGGCGTATT

1601 ACCTGAAAGG CGACgcggaA AGCGCGCTGC CGTATCTGcg gtattcgttt 

1651 gAAAACGACC CCGAGCCCGA AGTTGCCGCC CATTTGGGCG AAGTGTTGTG 

1701 GGCATTGGGC GAACGCGATC AGGCGGTTGA CGTATGGACG CAGGCGGCAC 

1751 ACCTTAGGGG AGACAAGAAA ATATGGCGGG AGACGCTCAA ACGCTACGGA 

1801 ATCGCCTTGC CCGAGCCTTC CCGAAAACCC CGGAAATAA

This encodes a protein having amino acid sequence [<SEQ ID 48>] (SEQ ID NO: 48):

1 MLPARFTILS VLAAALLAGQ AYAAGAA DVE LPKEVGKVLR KHRRYSEEEI 

51 KNERARLAAV GERVNRVFTL LGGETALQKG QAGTALATYM LMLERTKSPE 

101 VAERALEMAV SLNAFEQAEM IYQKWRQIEP IPGEAQKPAG WLRNVLKEGG

151 NQHLDGLKEV LAQSDDVQKR RIFLLLVQAA VQQGGVAQKA SKAVRRAALK 

201 YEHLPEAAVA DAVFGVQGRE KEKAIEALQR LAKLDTEILP PTLMTLRLTA 

251 RKYPEILDGF FEQTDTQNLS AVWQEMEIMN LVSLRKPDDA YARLNVLLEH 

301 NPNANLYIQA AILAANRKEG ASVIDGYAEK AYGRGTGEQR GRAAMTAAMI 

351 YADRRDYAKV RQWLKKVSAP EYLFDKGVLA AAAAAELDGG RAALRQIGRV

401 RKLPEQQGRY FTADNLSKIQ MLALSKLPDK REALIGLNNI IAKLSAAGST

451 EPLAEALAQR SIIYEQFGKR GKMIADLETA LKLTPDNAQI MNNLGYSLLS 

501 DSKRLDEGFA LLQTAYQINP DDTAVNDSIG WAYYLKGDAE SALPYLRYSF 

551 ENDPEPEVAA HLGEVLWALG ERDQAVDVWT QAAHLRGDKK IWRETLKRYG 

601 IALPEPSRKP RK*

ORF9ng (SEQ ID NO: 48) and ORF9-1 (SEQ ID NO: 42) show 88.1% identity in 614 aa overlap:

10 20 30 40 50 60 

orf 9-1 .pep MLPNRFKMLTVLTATLIAGQVSAAGGGAGDMKQPKEVGKVFRKQQRYSEEEIKNERARLA 
50 Ml || :| : ||:|:|:|||: ||| | : | : : | | | I II I : I I - I I I I I I I I I I I I I I I 

orf 9ng- 1 MLPARFT I LS VLAAALLAGQAYAAG - - AADVELPKEVGKVLRKHRRYSEEE I KNERARLA 

10 20 30 40 50 

70 80 90 100 110 120 

orf 9 - 1 . pep AVGERVNQI FTLLGGETALQKGQAGTALATYMLMLERTKSPEVAERALEMAVSLNAFEQA 
55 | | | | | | | : : | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | M 

orf 9ng- 1 AVGERVNRVFTLLGGETALQKGQAGTALATYMLMLERTKSPEVAERALEMAVSLNAFEQA 
60 70 80 90 100 110

130 140 150 160 170 180 

orf 9-1. pep EMIYQKWRQIEPIPGKAQKRAGWLRNVLRERGNQHLDGLEEVLAQADEGQNRRVFLLLAQ 

I I I M I I I I M I M I I MINN H I I I I I I I I : I I I I I H : hlhlllhl 
5 orf 9ng- 1 EM I YQKWRQ I E P I PGEAQKPAGWLRNVLKEGGNQHLDGLKE VLAQSDDVQKRRI FLLLVQ 

120 130 140 150 160 170 

190 200 210 220 230 240 

or f 9 - 1 . pep AAVQQDGLAQKASKAVRRAALKYEHLPEAAVADWFSVQGREKEKAIGALQRLAKLDTEI 
Mill h I I I I I I I I I I I I I I I I I I I II II hi h I I I h I I I I I IIIIIIIIIIM 
1 0 orf 9ng- 1 AAVQQGGVAQKAS KAVRRAALKYEHLPEAAVADAVFGVQGREKEKAI EALQRLAKLDTE I 

180 190 200 210 220 230 

250 260 270 280 290 300 

orf 9-1 .pep LPPTLMTLRLTARKYPEILDGFFEQTDTQNLSAVWQEMEIMNLVSLHRLDDAYARLNVLL 

MINI! IIIIIMIIIIIMI lllllll MM INI MINIMA 1 1 1 1 1 1 [ 1 1 1 1 

1 5 orf 9ng- 1 LPPTLMTLRLTARKYPEILDGFFEQTDTQNLSAVWQEMEIMNLVSLRKPDDAYARLNVLL 

240 250 260 270 280 290 

310 320 330 340 350 360 

orf 9-1 .pep ERNPNADLY I QAA I LAANRKEGAS VI DGYAEKAYGRGTEEQRSRAALTAAMMYADRRDY A 

h I I I I :| I I ] I I I I I II I I I I I I I I I I I M I I I I I I I ' h I I hi I I h I I I I I I I I 
20 orf 9ng- 1 EHNPNANLY I QAA I LAANRKEGAS VI DGYAEKAYGRGTGEQRGRAAMTAAM I YADRRDYA 

300 310 320 330 340 350 

370 380 390 400 410 420 

or f 9 - 1 . pep KVRQWLKKVSAPEYLFDKGVLAAAAAVELDGGRAALRQIGRVRKLPEQQGRYFTADNLSK 
h I I I I I h I I h h I I II I I I I I h I I I I I I I I I I h I h II I I I I II h h I I I 
25 orf 9ng- 1 KVRQWLKKVSAPEYLFDKGVLAAAAAAELDGGRAALRQIGRVRKLPEQQGRYFTADNLSK 

360 370 380 390 400 410 

430 440 450 460 470 480 

orf 9 - 1 . pep IQMLALSKLPDKREALRGLDKI IEKPPAGSNTELQAEALVQRSWYDRLGKRKKMISDLE 

I h I I I I I II I I I I I Ihh I |:::|| I I I I = I I I : = I : : = I I I llhlll 
30 orf 9ng-l IQMKALSKLPDKREALIGLNNIIAKLSAAGSTEPLAEALAQRSIIYEQFGKRGKMIADLE 

420 430 440 450 460 470 

490 500 510 520 530 540 

orf9-l.pep RAFRLAPDNAQ I MNNLGYS LLTDS KRLDEGFALLQTAYQ I NPDDTAVNDS I GWAY YLKGD 

h h h I I I II I I I I I I I I h I I I I II I I I h I I I I I I II I I I I I I I I I I h I I I h - 
35 orf 9ng- 1 TALKLTPDNAQ I MNNLGYS LLSDS KRLDEGFALLQTAYQ I NPDDTAVNDS I GWAY YLKGD 

480 490 500 510 520 530 

550 560 570 580 590 600 

or f 9 - 1 . pep AESALPYLRYSFENDPEPEVAAHLGEVLWALGERDQAVDVWTQAAHLTGDKKIWRETLKR 

llllllllllllhlllllllllll llllhllllll llhhl I I I : 1 I I I t I I I 
40 orf 9ng- 1 AESALPYLRYSFENDPEPEVAAHLGEVLWALGERDQAVDVWTQAAHLRGDKKIWRETLKR 

540 550 560 570 580 590 

610 

orf 9-1 .pep HGIALPQPSRKPRKX 
hlllhllllhll 
45 orf9ng-l YGIALPEPSRKPRKX 

600 610 

In addition, ORF9ng (SEQ ID NO: 48) shows significant homology with a hypothetical protein
(SEQ ID NO: 1115) from P. aeruginosa:

sp|P42810|YHE3_PSEAE HYPOTHETICAL 64.8 KD PROTEIN IN HEMM-HEMA INTERGENIC REGION
(ORF3) 

)gi | 1072999 |pir | |S49376 hypothetical protein 3 - Pseudomonas aeruginosa )gi|557259 
(X82071) orf3 [Pseudomonas aeruginosa] Length = 576 
Score = 128 bits (318), Expect = 1e-28

Identities = 138/587 (23%), Positives = 228/587 (38%), Gaps = 125/587 (21%) 

Query: 67 VFTLLGGETALQKGQAGTALATYMLMLERTKS PE VAERALEMAVSLNAFEQAEM I YQKWR 126 

+ ++LL E A Q+ + AL+ Y++ ++T+ P V+ERA +A LA ++A W 
Sbjct : 53 LYSLLVAELAGQRNRFDIALSNYWQAQKTRDPGVSERAFRIAEYLGADQEALDTSLLWA 112 

10 Query: 127 QIEPIPGEAQKPAG WLRNVLKEGGNQHLDGLKEVLAQSDDVQKRRI 172 

+ P +AQ+ A ++ VL G+ H D L A++D + + 

Sbjct: 113 RSAPDNLDAQRAAAIQLARAGRYEESMVYMEKVLNGQGDTHFDFLALSAAETDPDTRAGL 172 

Query: 173 FXXXXXXXXXXXXXXXKASKAVRRAALKYEHLPEAAVADAVFGVQGREKEKAIEALQRLA 232 

+ + KY + + A+ Q + +A+ L+ + 

15 Sbjct: 173 L QSFDHLLKKYPNNGQLLFGKALLLQQDGRPDEALTLLEDNS 214 

Query: 233 KLDTE I LP PTLMTLRLTARK YPE I LDGFFEQTDTQNLSAVWQEME IMNLVSLRKP 287 

E+PL+L + K P+GED + + + + LV + 
Sbjct: 215 ASRHEVAPLLLRSRLLQSMKRSDEALPLLKAGIKEHPDDKRVRLAYARL LVEQNRL 270 

Query: 288 DDAYARLNVLLEHNPN -ANLYIQAAI 312 

20 DDA A L++ P+ A +Y++ + 

Sbjct: 271 DDAKAEFAGLVQQFPDDDDDLRFSLALVCLEAQAWDEARIYLEELVERDSHVDAAHFNLG 330 

Query: 313 -LAANRKEGASVIDGYAEKAYGRGTGEQRGRAAMTAAMIYADRRDYAKVRQWLKKVSAPE 371 

LA +K+ A +D YA+ GG + T++ARDAR + P+ 

Sbjct: 331 RLAEEQKDTARALDEYAQ - - VGPGNDFLPAQLRQTDVLLKAGRVDEAAQRLDKARSEQPD 388 

25 Query: 372 YLFDKXXXXXXXXXXXXXXXXXXRQ IGRVRKLPEQQGRYFTADNLS KI QMLALS KLPDKR 431 

Y A L 1+ ALS + 

Sbjct: 3 89 Y AI QLYL I EAEALSNNDQQE 4 08 

Query: 432 EALIGLNNIIAKLSAAGSTEPLAEALAQRSIIYEQFGKRGKMIADLETALKLTPDNAQIM 491 
+A + + + ELL RS + + E+ +M DL + PDNA + 

30 Sbjct: 409 KAWQAIQEGLKQYP EDL-NLLYTRSMLAEKRNDLAQMEKDLRFVIAREPDNAMAL 462 

Query: 4 92 NNLGYS LLSDS KRLDEGFALLQTAYQ INPDDTAVNDS I GWAYYLKGDAES ALPYLRYS FE 551 

N LGY+L + R E L+ A+ + +NPDD A+ DS+GW Y +G A YLR + + 
Sbjct: 463 NALGYTLADRTTRYGEARELILKAHKLNPDDPAILDSMGWINYRQGKLADAERYLRQALQ 522 

Query: 552 NDPEPEVAAHLGEVLWALGERDQAVDVWTQAAHLRGDKKIWRETLKR 598 
35 P+ EVAAHLGEVLWA G+A+W+ +D+R T+KR 

Sbjct: 523 RYPDHEVAAHLGEVLWAQGRQGDARAIWREYLDKQPDSDVLRRTIKR 569 

gi | 2983399 (AE000710) hypothetical protein (SEQ ID NO: 1116) [Aquifex aeolicus]
Length = 545
Score = 81.5 bits (198), Expect = 1e-14
Identities = 61/198 (30%), Positives = 98/198 (48%), Gaps = 19/198 (9%)

Query: 408 GRYFTADNL- SKIQMLALSKLPDKREALIGLNNI IAKLSAAGSTEPLAEALAQ 459 

G Y A L K ++LA PDK+E L + +K + + L + 

Sbjct: 335 GNYEDAKRLIEKAKVLA PDKKE I LFLEADYYSKTKQYDKALEI LKKLEKDYPNDSR 390 

Query: 460 RSIIYEQFGKRGKMIADLETALKLTPDNAQIMNNLGYSLLS--DSKRLDEGFALLQ 513

+I+Y+ G L A++L P+N N LGYSLL +R++E L+ +

Sbjct: 391 VYFMEAIVYDNLGDIKNAEKALRKAIELDPENPDYYNYLGYSLLLWYGKERVEEAEELIK 450

Query: 514 TAYQINPDDTAVNDSIGWAYYLKGDAESALPYLRYSF-ENDPEPEVAAHLGEVLWALGER 572

A + +P++ A DS+GW YYLKGD E A+ YL + E +P V H+G+VL +G +

Sbjct: 451 KALEKDPENPAYIDSMGWVYYLKGDYERAMQYLLKALREAYDDPVVNEHVGDVLLKMGYK 510

Query: 573 DQAVDVWTQAAHLRGDKK 590

++A + + +A L + K

Sbjct: 511 EEARNYYERALKLLEEGK 528





Based on this analysis, it is predicted that the proteins from N. meningitidis and N. gonorrhoeae, and 
their epitopes, could be useful antigens for vaccines or diagnostics, or for raising antibodies. 

Example 7 



The following partial DNA sequence was identified in N. meningitidis [<SEQ ID 49>] (SEQ ID
NO: 49):



1 AACCTCTACG CCGGCCCGCA GACCACATCC GTCATCGCAA ACATCGCCGA 

51 CAACCTGCAA CTGGCCAAAG ACTACGGCAA AGTACACTGG TTCGCCTCCC

101 CGCTCTTCTG GCTCCTGAAC CAACTGCACA ACATCATCGG CAACTGGGGC 

151 TGGGCGATTA TCGTTTTAAC CATCATCGTC AAAGCCGTAC TGTATCCATT

201 GACCAACGCC TCTTACCGCT CTATGGCGAA AATGCGTGCC GCCGCACCCA 

251 AACTGCAAGC CATCAAAGAG AAATACGGCG ACGACCGTAT GGCGCAACAA 

301 CAGGCGATGA TGCAGCTTTA CACAGACGAG AAAATCAACC CG^CTGGGCG 

351 GCTGCCTGCC TATGCTGTTG CAAATCCCCG TCTTCATCGG ATTGTATTGG 

401 GCATTGTTCG CCTCCGTAGA ATTGCGCCAG GCACCTTGGC TGGGTTGGAT

451 TACCGACCTC AGCCGCGCCG ACCCCTACTA CATCCTGCCC ATCATTATGG

501 CGGCAACGAT GTTCGCCCAA ACTTATCTGA ACCCGCCGCC GAcCGACCCG 

551 ATGCagGCGA AAATGATGAA AATCATGCCG TTGGTTTTCT CsGwCrTGTT 

601 CTTCTTCTTC CCTGCCGGks TGGTATTGTA CTGGGTAGTC AACAACCTCC 

651 TGACCATCGC CCAGCAATGG CACATCAACC GCAGCATCGA AAAACAACGC 

701 GCCCAAGGCG AAGTCGTTTC CTAA 

This corresponds to the amino acid sequence [<SEQ ID 50; ORF11>] (SEQ ID NO: 50; ORF11):



1 ..NLYAGPQTTS VIANIADNLQ LAKDYGKVHW FASPLFWLLN QLHNIIGNWG

51 WAIIVLTIIV KAVLYPLTNA SYRSMAKMRA AAPKLQAIKE KYGDDRMAQQ

101 QAMMQLYTDE KINPLGGCLP MLLQIPVFIG LYWALFASVE LRQAPWLGWI

151 TDLSRADPYY ILPIIMAATM FAQTYLNPPP TDPMQAKMMK IMPLVFSXXF

201 FFFPAGXVLY WVVNNLLTIA QQWHINRSIE KQRAQGEVVS*
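
The correspondence between a nucleotide listing and its predicted protein, as with SEQ ID NO: 49 and SEQ ID NO: 50 above, can be illustrated with a minimal translation sketch. It assumes the standard genetic code and reading frame 1; the function name `translate` is hypothetical, and the source does not state which software performed the translation. Any codon containing an ambiguity code (such as s, w, r or k) is reported as X, which is more conservative than the listing, where a codon such as TCs can still be resolved to S.

```python
from itertools import product

# Standard genetic code, codons enumerated in TCAG order (first base varies slowest).
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def translate(dna: str) -> str:
    """Translate a DNA string in frame 1; spaces and position numbers are ignored."""
    seq = "".join(ch for ch in dna.upper() if ch.isalpha())
    protein = []
    for i in range(0, len(seq) - 2, 3):
        # Codons with ambiguity codes are not in the table and become 'X'.
        protein.append(CODON_TABLE.get(seq[i:i + 3], "X"))
    return "".join(protein)


# First thirty bases of the partial sequence listed above.
print(translate("AACCTCTACG CCGGCCCGCA GACCACATCC"))  # NLYAGPQTTS
```

Applied to the last eighteen bases of the same listing (GGCGAAGTCGTTTCCTAA), the sketch returns GEVVS followed by the stop symbol.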

Further sequence analysis revealed the complete DNA sequence [<SEQ ID 51>] (SEQ ID NO: 51):



1 ATGGATTTTA AAAGACTCAC GGCGTTTTTC GCCATCGCGC TGGTGATTAT 

51 GATCGGCTGG GAAAAGATGT TCCCCACTCC GAAGCCAGTC CCCGCGCCCC 

101 AACAGGCAGC ACAACAACAG GCCGTAACCG CTTCCGCCGA AGCCGCGCTC 

151 GCGCCCGCAA CGCCGATTAC CGTAACGACC GACACGGTTC AAGCCGTCAT 

201 TGATGAAAAA AGCGGCGACC TGCGCCGGCT GACCCTGCTC AAATACAAAG 

251 CAACCGGCGA CGAAAATAAA CCGTTCATCC TGTTTGGCGA CGGCAAAGAA 

301 TACACCTACG TCGCCCAATC CGAACTTTTG GACGCGCAGG GCAACAACAT 

351 TCTAAAAGGC ATCGGCTTTA GCGCACCGAA AAAACAGTAC AGCTTGGAAG 
401 GCGACAAAGT TGAAGTCCGC CTGAGCGCGC CTGAAACACG CGGTCTGAAA

451 ATCGACAAAG TTTATACTTT CACCAAAGGC AGCTATCTGG TCAACGTCCG

501 CTTCGACATC GCCAACGGCA GCGGTCAAAC CGCCAACCTG AGCGCGGACT 

551 ACCGCATCGT CCGCGACCAC AGCGAACCCG AGGGTCAAGG TTACTTTACC 

601 CACTCTTACG TCGGCCCTGT TGTTTATACC CCTGAAGGCA ACTTCCAAAA 

651 AGTCAGCTTT TCCGACTTGG ACGACGATGC CAAATCCGGC AAATCCGAGG 

701 CCGAATACAT CCGCAAAACC CCGACCGGCT GGCTCGGCAT GATTGAACAC 
751 CACTTCATGT CCACCTGGAT TCTCCAACCT AAAGGCAGAC AAAGCGTTTG

801 CGCCGCAGGC GAGTGCAACA TCGACATCAA ACGCCGCAAC GACAAGCTGT 

851 ACAGCACCAG CGTCAGCGTG CCTTTAGCCG CCATCCAAAA CGGCGCGAAA 

901 GCCGAAGCCT CCATCAACCT CTACGCCGGC CCGCAGACCA CATCCGTCAT 

951 CGCAAACATC GCCGACAACC TGCAACTGGC CAAAGACTAC GGCAAAGTAC 

1001 ACTGGTTCGC CTCCCCGCTC TTCTGGCTCC TGAACCAACT GCACAACATC 

1051 ATCGGCAACT GGGGCTGGGC GATTATCGTT TTAACCATCA TCGTCAAAGC 

1101 CGTACTGTAT CCATTGACCA ACGCCTCTTA CCGCTCTATG GCGAAAATGC 

1151 GTGCCGCCGC ACCCAAACTG CAAGCCATCA AAGAGAAATA CGGCGACGAC 

1201 CGTATGGCGC AACAACAGGC GATGATGCAG CTTTACACAG ACGAGAAAAT

1251 CAACCCGCTG GGCGGCTGCC TGCCTATGCT GTTGCAAATC CCCGTCTTCA 

1301 TCGGATTGTA TTGGGCATTG TTCGCCTCCG TAGAATTGCG CCAGGCACCT 

1351 TGGCTGGGTT GGATTACCGA CCTCAGCCGC GCCGACCCCT ACTACATCCT 

1401 GCCCATCATT ATGGCGGCAA CGATGTTCGC CCAAACTTAT CTGAACCCGC

1451 CGCCGACCGA CCCGATGCAG GCGAAAATGA TGAAAATCAT GCCGTTGGTT

1501 TTCTCCGTCA TGTTCTTCTT CTTCCCTGCC GGTCTGGTAT TGTACTGGGT 

1551 AGTCAACAAC CTCCTGACCA TCGCCCAGCA ATGGCACATC AACCGCAGCA 

1601 TCGAAAAACA ACGCGCCCAA GGCGAAGTCG TTTCCTAA 

This corresponds to the amino acid sequence [<SEQ ID 52; ORF11-1>] (SEQ ID NO: 52; ORF11-1):



1 MDFKRLTAFF AIALVIMIGW EKMFPTPKPV PAPQQAAQQQ AVTASAEAAL

51 APATPITVTT DTVQAVIDEK SGDLRRLTLL KYKATGDENK PFILFGDGKE

101 YTYVAQSELL DAQGNNILKG IGFSAPKKQY SLEGDKVEVR LSAPETRGLK

151 IDKVYTFTKG SYLVNVRFDI ANGSGQTANL SADYRIVRDH SEPEGQGYFT

201 HSYVGPVVYT PEGNFQKVSF SDLDDDAKSG KSEAEYIRKT PTGWLGMIEH

251 HFMSTWILQP KGRQSVCAAG ECNIDIKRRN DKLYSTSVSV PLAAIQNGAK

301 AEASINLYAG PQTTSVIANI ADNLQLAKDY GKVHWFASPL FWLLNQLHNI

351 IGNWGWAIIV LTIIVKAVLY PLTNASYRSM AKMRAAAPKL QAIKEKYGDD

401 RMAQQQAMMQ LYTDEKINPL GGCLPMLLQI PVFIGLYWAL FASVELRQAP

451 WLGWITDLSR ADPYYILPII MAATMFAQTY LNPPPTDPMQ AKMMKIMPLV

501 FSVMFFFFPA GLVLYWVVNN LLTIAQQWHI NRSIEKQRAQ GEVVS*

Computer analysis of this amino acid sequence gave the following results: 

Homology with a 60kDa inner-membrane protein (accession P25754) (SEQ ID NO: 1117) of
Pseudomonas putida

ORF11 (SEQ ID NO: 50) and the 60kDa protein (SEQ ID NO: 1117) show 58% aa identity in 229
aa overlap (BLASTp).



ORF11 2 LYAGPQTTSVIANIADNLQLAKDYGKVHWFASPLFWLLNQLHNIIGNWGWAIIVLTIIVK 61
LYAGP+ S + + + L+L DYG + + A P+FWLL +H+++GNWGW+IIVLT+++K
60K 324 LYAGPKIQSKLKELSPGLELTVDYGFLWFIAQPIFWLLQHIHSLLGNWGWSIIVLTMLIK 383

ORF11 62 AVLYPLTNASYRSMAKMRAAAPKLQAIKEKYGDDRMAQQQAMMQLYTDEKINPLGGCLPM 121
60K 384 GLFFPLSAASYRSMARMRAVAPKLAALKERFGDDRQKMSQAMMELYKKEKINPLGGCLPI 443

ORF11 122 LLQIPVFIGLYWALFASVELRQAPWLGWITDLSRADPYYILPIIMAATMFAQTYLNPPPT 181
L+Q+PVF+ LYW L SVE+RQAPW+ WITDLS DP++ILPIIM ATMF Q LNP P
60K 444 LVQMPVFLALYWVLLESVEMRQAPWILWITDLSIKDPFFILPIIMGATMFIQQRLNPTPP 503

ORF11 182 DPMQAKMMKIMPLVXXXXXXXXPAGXVLYWVVNNLLTIAQQWHINRSIE 230
DPMQAK+MK+MP++ PAG VLYWVVNN L+I+QQW+I R IE
60K 504 DPMQAKVMKMMPIIFTFFFLWFPAGLVLYWVVNNCLSISQQWYITRRIE 552
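
The percent identity figures quoted for these alignments can be reproduced with a short sketch, given here under stated assumptions. It takes two already-aligned strings of equal length, counts columns holding the same residue, and divides by the number of aligned columns. The function name `percent_identity` and the `masked` parameter are hypothetical; treating gap characters and masked X residues as non-identical is an assumption, since the source does not say how its alignment programs count them.

```python
def percent_identity(a: str, b: str, masked: str = "-X") -> float:
    """Percent of aligned columns holding the same, unmasked residue."""
    if len(a) != len(b):
        raise ValueError("aligned sequences must have equal length")
    if not a:
        raise ValueError("aligned sequences must not be empty")
    matches = sum(
        1
        for x, y in zip(a.upper(), b.upper())
        if x == y and x not in masked
    )
    return 100.0 * matches / len(a)


# First twelve columns of the last block of the ORF11 / 60kDa alignment above.
print(round(percent_identity("DPMQAKMMKIMP", "DPMQAKVMKMMP"), 1))  # 83.3
```

Ten of the twelve columns agree in this excerpt, so the sketch reports 83.3%. The 58% figure in the text refers to the full 229 aa overlap.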





Homology with a predicted ORF from N. meningitidis (strain A)

ORF11 (SEQ ID NO: 50) shows 97.9% identity over a 240aa overlap with an ORF (ORF11a)
(SEQ ID NO: 54) from strain A of N. meningitidis:



                                                  10        20        30
orf11.pep                                 NLYAGPQTTSVIANIADNLQLAKDYGKVHW
                                          ||||||||||||||||||||| ||||||||
orf11a      IKRRNDKLYSTSVSVPLAAIQNGAKSXASINLYAGPQTTSVIANIADNLQLXKDYGKVHW
              280       290       300       310       320       330

                    40        50        60        70        80        90
orf11.pep   FASPLFWLLNQLHNIIGNWGWAIIVLTIIVKAVLYPLTNASYRSMAKMRAAAPKLQAIKE
            ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
orf11a      FASPLFWLLNQLHNIIGNWGWAIIVLTIIVKAVLYPLTNASYRSMAKMRAAAPKLQAIKE
              340       350       360       370       380       390

                   100       110       120       130       140       150
orf11.pep   KYGDDRMAQQQAMMQLYTDEKINPLGGCLPMLLQIPVFIGLYWALFASVELRQAPWLGWI
            ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
orf11a      KYGDDRMAQQQAMMQLYTDEKINPLGGCLPMLLQIPVFIGLYWALFASVELRQAPWLGWI
              400       410       420       430       440       450

                   160       170       180       190       200       210
orf11.pep   TDLSRADPYYILPIIMAATMFAQTYLNPPPTDPMQAKMMKIMPLVFSXXFFFFPAGXVLY
            ||||||||||||||||||||||||||||||||||||||||||||| ||||| |||| |||
orf11a      TDLSRADPYYILPIIMAATMFAQTYLNPPPTDPMQAKMMKIMPLVXSXXFFXFPAGLVLY
              460       470       480       490       500       510

                   220       230       240
orf11.pep   WVVNNLLTIAQQWHINRSIEKQRAQGEVVSX
            || ||||||||||||||||||||||||||||
orf11a      WVINNLLTIAQQWHINRSIEKQRAQGEVVSX
              520       530       540

The complete length ORF11a nucleotide sequence [<SEQ ID 53>] (SEQ ID NO: 53) is:

1 ANGGATTTTA AAAGACTCAC NGNGTTTTTC GCCATCGCAC TGGTGATTAT

51 GATCGGATNG NAAANGATGT TCCCCACTCC GAAGCCCGTC CCCGCGCCCC 

101 AACAGACGGC ACAACAACAG GCCGTAANCG CTTCCGCCGA AGCCGCGCTC 

151 GCGCCCGNAN CGCCGATTAC CGTAACGACC GACACGGTTC AAGCCGTCAT 

201 TGATGAAAAA AGCGGCGACC TGCGCCGGCT GACCCTGCTC AAATACAAAG 

251 CAACCGGCGA CNAAAATAAA CCGTTCATCC TGTTTGGCGA CGGCAAANAA



301 TACACCTACN TCGCCCANTC CGAACTTTTG GACGCGCAGG GCAACAACAT

351 TCTAAAAGGC ATCGGCTTTA GCGCACCGAA AAAACAGTAC AGCTTGGAAG 

401 GCGACAAAGT TGAAGTCCGC CTGAGCGCAC CTGAAACACG CGGTCTGAAA

451 ATCGACAAAG TTTATACTTT CACCAAAGGC AGCTATCTGG TCAACGTCCG

501 CTTCGACATC GCCAACGGCA GCGGTCAAAC CGCCAACCTG AGCGCGGACT

551 ACCGCATCGT CCGCGACCAC AGCGAACCCG AGGGTCAAGG CTACTTTACC 

601 CACTCTTACG TCGGCCCTGT TGTTTATACC CCTGAAGGCA ACTTCCAAAA 

651 AGTCAGCTTC TCCGACTTGG ACGACGATGC CAANTCCGGN AAATCCGAGG 

701 CCGAATACAT CCGCAAAACC CNGACCGGCT GGCTCGGCAT GATTGAACAC 

751 CACTTCATGT CCACCTGGAT CCTCCAACCC AAAGGCGGAC AAAGCGTTTG

801 CGCCGCTGGC GACTGCNGTA TNGACATCAA ACGCCGCAAC GACAAGCTGT 

851 ACAGCACCAG CGTCAGCGTG CCTTTAGCCG CTATCCAAAA CGGTGCGAAA 

901 TCCNAAGCCT CCATCAACCT CTACGCCGGC CCACAGACCA CATCNGTTAT 

951 CGCAAACATC GCCGACAACC TGCAACTGGN CAAAGACTAC GGCAAAGTAC 

1001 ACTGGTTCGC CTCCCCCCTC TTTTGGCTTT TGAACCAACT GCACAACATC

1051 ATCGGCAACT GGGGCTGGGC GATTATCGTT TTAACCATCA TCGTCAAAGC 

1101 CGTACTGTAT CCATTGACCA ACGCCTCTTA CCGTTCGATG GCGAAAATGC 

1151 GTGCCGCCGC GCCCAAACTG CAAGCCATCA AAGAGAAATA CGGCGACGAC 

1201 CGTATGGCGC AGCAACAAGC CATGATGCAG CTTTACACAG ACGAGAAAAT 

1251 CAACCCGCTG GGCGGCTGCC TGCCTATGCT GTTGCAAATC CCCGTCTTCA

1301 TCGGATTGTA TTGGGCATTG TTCGCCTCCG TAGAATTGCG CCAGGCACCT 

1351 TGGCTGGGTT GGATTACCGA CCTCAGCCGC GCCGACCCNT ACTACATCCT 

1401 GCCCATCATT ATGGCGGCAA CGATGTTCGC CCAAACCTAT CTGAACCCGC

1451 CGCCGACCGA CCCGATGCAG GCGAAAATGA TGAAAATCAT GCCTTTGGTT

1501 NTNTCNNNNA NGTTCTTCNN CTTCCCTGCC GGTCTGGTAT TGTACTGGGT

1551 GATCAACAAC CTCCTGACCA TCGCCCAGCA ATGGCACATC AACCGCAGCA 

1601 TCGAAAAACA ACGCGCCCAA GGCGAAGTCG TTTCCTAA 

This encodes a protein having amino acid sequence [<SEQ ID 54>] (SEQ ID NO: 54):

1 XDFKRLTXFF AIALVIMIGX XXMFPTPKPV PAPQQTAQQQ AVXASAEAAL

51 APXXPITVTT DTVQAVIDEK SGDLRRLTLL KYKATGDXNK PFILFGDGKX

101 YTYXAXSELL DAQGNNILKG IGFSAPKKQY SLEGDKVEVR LSAPETRGLK

151 IDKVYTFTKG SYLVNVRFDI ANGSGQTANL SADYRIVRDH SEPEGQGYFT

201 HSYVGPVVYT PEGNFQKVSF SDLDDDAXSG KSEAEYIRKT XTGWLGMIEH

251 HFMSTWILQP KGGQSVCAAG DCXXDIKRRN DKLYSTSVSV PLAAIQNGAK

301 SXASINLYAG PQTTSVIANI ADNLQLXKDY GKVHWFASPL FWLLNQLHNI

351 IGNWGWAIIV LTIIVKAVLY PLTNASYRSM AKMRAAAPKL QAIKEKYGDD

401 RMAQQQAMMQ LYTDEKINPL GGCLPMLLQI PVFIGLYWAL FASVELRQAP

451 WLGWITDLSR ADPYYILPII MAATMFAQTY LNPPPTDPMQ AKMMKIMPLV

501 XSXXFFXFPA GLVLYWVINN LLTIAQQWHI NRSIEKQRAQ GEVVS*

ORF11a (SEQ ID NO: 54) and ORF11-1 (SEQ ID NO: 52) show 95.2% identity in 544 aa overlap:



10 20 30 40 50 60 

orf 11a. pep XDFKRLTXFFAIALVIMIGXXXMFPTPKPVPAPQQTAQQQAVXASAEAALAPXXPITVTT 

45 1 1 II 1 1 1 1 1 1 1 1 1 1 1 1 1 llllllllllllhlllllhlllllllll • 1 1 1 1 1 1 

orf 11-1 MDFKRLTAFFAIALVIMIGWEKMFPTPKPVPAPQQAAQQQAVTASAEAALAPATPITVTT 

10 20 30 40 50 60 

70 80 90 100 110 120 

orf 11a . pep DTVQAVIDEKSGDLRRLTLLKYKATGDXNKPFILFGDGKXYTYXAXSELLDAQGNNILKG 

50 Illllllllllllllllllllllllll lllllllllll MM MM MIMII III 

orf 11 - 1 DTVQAVIDEKSGDLRRLTLLKYKATGDENKPFILFGDGKEYTYVAQSELLDAQGNNILKG 

70 80 90 100 110 120 



130 140 150 160 170 180

orf11a.pep IGFSAPKKQYSLEGDKVEVRLSAPETRGLKIDKVYTFTKGSYLVNVRFDIANGSGQTANL

I II II MM I III II II 1 1 II MM iMMI II MM II II II I! I III III II 1 1 1 III

orf 11-1 IGFSAPKKQYSLEGDKVEVRLSAPETRGLKIDKVYTFTKGSYLVNVRFDIANGSGQTANL 

130 140 150 160 170 180 

190 200 210 220 230 240 

5 orf 11a. pep SADYRIVRDHSEPEGQGYFTHSYVGPWYTPEGNFQKVSFSDLDDDAXSGKSEAEYIRKT 

Ml I M II II M I M 1 1 1 1 M II 1 1 II 1 1 1 1 1 1 M 1 1 1 1 1 1 1 1 1 Ml IIMMMMM 

orf 11-1 SADYRIVRDHSEPEGQGYFTHSYVGPWYTPEGNFQKVSFSDLDDDAKSGKSEAEYIRKT 

190 200 210 220 230 240 

250 260 270 280 290 300 

] 0 orf lla . pep XTGWLGMIEHHFMSTWILQPKGGQSVCAAGDCXXDIKRRNDKLYSTSVSVPLAAIQNGAK 

1 1 M 1 1 1 1 1 Mi 1 1 II I II i I I II I M M I IMIMMIM Mill Ml MM 

orf 11-1 PTGWLGMIEHHFMSTWILQPKGRQSVCAAGECNIDIKRRNDKLYSTSVSVPLAAIQNGAK 

250 260 270 280 290 300 

310 320 330 340 350 360 

1 5 orf lla .pep SXASINLYAGPQTTSVIANIADNLQLXKDYGKVHWFASPLFWLLNQLHNIIGNWGWAIIV 

: II 1 1 1 II I II 1 1 1 1 1 1 1 1 1 1 1 1 1 IMMMMIM I I I I I MIMMM 

orf 11-1 AEASINLYAGPQTTSVIANIADNLQLAKDYGKVHWFASPLFWLLNQLHNI IGNWGWAI IV 

310 320 330 340 350 360 

370 380 390 400 410 420 

20 orf lla .pep LTIIVKAVLYPLTNASYRSMAKMRAAAPKLQAIKEKYGDDRMAQQQAMMQLYTDEKINPL 

1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 i 1 1 1 1 1 1 ; I I I I I 

orf 11-1 LTIIVKAVLYPLTNASYRSMAKMRAAAPKLQAIKEKYGDDRMAQQQAMMQLYTDEKINPL 

370 380 390 400 410 420 

430 440 450 460 470 480 

25 orf lla. pep GGCLPMLLQIPVFIGLYWALFASVELRQAPWLGWITDLSRADPYYILPIIMAATMFAQTY 

I I I I I I I I I I I I I I I I I I I : I I I I I : I : I I I I I I I I I 

orf 11-1 GGCLPMLLQ I PVF I GLYWALFAS VELRQAPWLGW I TDLSRADP YY I LP I IMAATMFAQTY 

430 44 0 450 460 470 480 

490 500 510 520 530 540 

30 orf lla . pep LNPPPTDPMQAKMMKIMPLVXSXXFFXFPAGLVLYWVINNLLTIAQQWHINRSIEKQRAQ 

1 1 1 II 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 I II M III MIMMM III I MIMMM 

orf 11-1 LNPPPTDPMQAKMMKIMPLVFSVMFFFFPAGLVLYWVVNNLLTIAQQWHINRSIEKQRAQ 

490 500 510 520 530 540 

35 orf lla. pep GEWSX 

Mill 

orf 11-1 GEWSX 

Homology with a predicted ORF from N. gonorrhoeae 

ORF11 (SEQ ID NO: 50) shows 96.3% identity over a 240aa overlap with a predicted ORF
(ORF11.ng) (SEQ ID NO: 56) from N. gonorrhoeae:



orf11       NLYAGPQTTSVIANIADNLQLAKDYGKVHWFASPLFWLLNQLHNIIGNWGWAIIVLT 57
            ||||||||||||||||||||||||||||||||||||||||||||||||||||| |||
orf11ng  MAVNLYAGPQTTSVIANIADNLQLAKDYGKVHWFASPLFWLLNQLHNIIGNWGWAIVVLT 60

orf11    IIVKAVLYPLTNASYRSMAKMRAAAPKLQAIKEKYGDDRMAQQQAMMQLYTDEKINPLGG 117
         |||||||||||||||||||||||||| || |||||||||||||||||||  || ||||||
orf11ng  IIVKAVLYPLTNASYRSMAKMRAAAPELQTIKEKYGDDRMAQQQAMMQLFEDEEINPLGG 120

orf11    CLPMLLQIPVFIGLYWALFASVELRQAPWLGWITDLSRADPYYILPIIMAATMFAQTYLN 177
         ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
orf11ng  CLPMLLQIPVFIGLYWALFASVELRQAPWLGWITDLSRADPYYILPIIMAATMFAQTYLN 180

orf11    PPPTDPMQAKMMKIMPLVFSXXFFFFPAGXVLYWVVNNLLTIAQQWHINRSIEKQRAQGE 237
         ||||||||||||||||||||  ||||||| ||||||||||||||||||||||||||||||
orf11ng  PPPTDPMQAKMMKIMPLVFSVMFFFFPAGLVLYWVVNNLLTIAQQWHINRSIEKQRAQGE 240

orf11    VVS 240
         |||
orf11ng  VVS 243

An ORF11ng nucleotide sequence [<SEQ ID 55>] (SEQ ID NO: 55) was predicted to encode a
protein having amino acid sequence [<SEQ ID 56>] (SEQ ID NO: 56):

1 MAVNLYAGPQ TTSVIANIAD NLQLAKDYGK VHWFASPLFW LLNQLHNIIG

51 NWGWAIVVLT IIVKAVLYPL TNASYRSMAK MRAAAPELQT IKEKYGDDRM

101 AQQQAMMQLF EDEEINPLGG CLPMLLQIPV FIGLYWALFA SVELRQAPWL

151 GWITDLSRAD PYYILPIIMA ATMFAQTYLN PPPTDPMQAK MMKIMPLVFS

201 VMFFFFPAGL VLYWVVNNLL TIAQQWHINR SIEKQRAQGE VVS*

Further sequence analysis revealed the complete gonococcal DNA sequence [<SEQ ID 57>] (SEQ
ID NO: 57) to be:



1 ATGGATTTTA AAAGACTCAC GGCGTTTTTC GCCATCGCGC TGGTGATTAT

51 GATCGGCTGG GAAAAAATGT TCCCCACCCC GAAACCCGTC CCCGCGCCCC

101 AACAGGCGGC ACAAAAACAG GCAGCAACCG CTTCCGCCGA AGCCGCGCTC

151 GCGCCCGCAA CGCCGATTAC CGTAACGACC GACACGGTTC AAGCCGTTAT

201 TGATGAAAAA AGTGGCGACC TGCGCCGGCT GACCCTGCTC AAATACAAAG

251 CAACCGGCGA CGAAAACAAA CCGTTCGTCC TGTTTGGCGA CGGCAAAGAA

301 TACACCTACG TCGCCCAATC CGAACTTTTG GACGCGCAGG GCAACAACAT

351 TCTGAAAGGC ATCGGCTTTA GCGCACCGAA AAAACAGTAC ACCCTCAACG

401 GCGACACAGT CGAAGTCCGC CTGAGCGCGC CCGAAACCAA CGGACTGAAA

451 ATCGACAAAG TCTATACCTT TACCAAAGAC AGCTATCTGG TCAACGTCCG

501 CTTCGACATC GCCAACGGCA GCGGTCAAAC CGCCAACCTG AGCGCGGACT

551 ACCGCATCGT CCGCGACCAC AGCGAACCCG AGGGTCAAGG CTACTTTACC

601 CACTCTTACG TCGGCCCTGT TGTTTATACC CCTGAAGGCA ACTTCCAAAA

651 AGTCAGCTTC TCCgacTTgg acgACGATGC gaaaTccggc aaATccgagg

701 ccgaatacaT CCGCAAAACC ccgaccggtt ggctcggcat gattgaacac

751 cacttcatgt ccacctggat cctccAAcct aaaggcggcc aaaacgtttg

801 cgcccaggga gactgccgta tcgacattaa aCgccgcaac gacaagctgt

851 acagcgcaag cgtcagcgtg cctttaaccg ctatcccaac ccgggggcca

901 aaaccgaaaa tggcggTCAA CCTGTATGCC GGTCCGCAAA CCACATCCGT

951 TATCGCAAAC ATCGCcgacA ACCTGCAACT GGCAAAAGAC TACGGTAAAG

1001 TACACTGGTT CGCATCGCCG CTCTTCTGGC TCCTGAACCA ACTGCACAAC

1051 ATTATCGGCA ACTGGGGCTG GGCAATCGTC GTTTTGACCA TCATCGTCAA

1101 AGCCGTACTG TATCCATTGA CCAACGcctc ctACCGTTCG ATGGCGAAAA

1151 TGCGTGccgc cgcacCcaaA CTGCAGACCA TCAAAGAAAA ATAcgGCGAC

1201 GACCGTATGG GGCAACAGCA AGCGATGATG CAGCTTTACA AAgacgAGAA

1251 AATCAACCCG CTGGGCGGCT GTctgcctat gctgttgCAA ATCCCCGTCT

1301 TCATCGGCTT GTACTGGGCA TTGTTCGCCT CCGTAGAATT GCGCCAGGCA

1351 CCTTGGCTGG GCTGGATTAC CGACCTCAGC CGCGCCGACC CCTACTACAT

1401 CCTGCCCATC ATTATGGCGG CAACGATGTT CGCCCAAACC TATCTGAACC

1451 CGCCGCCGAC CGACCCGATG CAGGCGAAAA TGATGAAAAT CATGCCGTTG

1501 GTTTTCTCCG TCATGTTCTT CTTCTTCCCT GCCGGTTTGG TTCTCTACTG

1551 GGTGGTCAAC AACCTCCTGA CCATCGCCCA GCAGTGGCAC ATCAACCGCA

1601 GCATCGAAAA ACAACGCGCC CAAGGCGAAG TCGTTTCCTA A

This encodes a protein having amino acid sequence [<SEQ ID 58; ORF11ng-1>] (SEQ ID NO: 58;
ORF11ng-1):

1 MDFKRLTAFF AIALVIMIGW EKMFPTPKPV PAPQQAAQKQ AATASAEAAL

51 APATPITVTT DTVQAVIDEK SGDLRRLTLL KYKATGDENK PFVLFGDGKE

101 YTYVAQSELL DAQGNNILKG IGFSAPKKQY TLNGDTVEVR LSAPETNGLK

151 IDKVYTFTKD SYLVNVRFDI ANGSGQTANL SADYRIVRDH SEPEGQGYFT

201 HSYVGPVVYT PEGNFQKVSF SDLDDDAKSG KSEAEYIRKT PTGWLGMIEH

251 HFMSTWILQP KGGQNVCAQG DCRIDIKRRN DKLYSASVSV PLTAIPTRGP

301 KPKMAVNLYA GPQTTSVIAN IADNLQLAKD YGKVHWFASP LFWLLNQLHN

351 IIGNWGWAIV VLTIIVKAVL YPLTNASYRS MAKMRAAAPK LQTIKEKYGD

401 DRMAQQQAMM QLYKDEKINP LGGCLPMLLQ IPVFIGLYWA LFASVELRQA

451 PWLGWITDLS RADPYYILPI IMAATMFAQT YLNPPPTDPM QAKMMKIMPL

501 VFSVMFFFFP AGLVLYWVVN NLLTIAQQWH INRSIEKQRA QGEVVS*

ORF11ng-1 (SEQ ID NO: 58) and ORF11-1 (SEQ ID NO: 52) show 95.1% identity in 546 aa
overlap:

10 20 30 40 50 60 

orf ling- 1 . pep MDFKRLTAFFAIALVIMIGWEKMFPTPKPVPAPQQAAQKQAATASAEAALAPATPITVTT 

25 | | | | | | M | | | | | | | | | | | | | | | | M | | | M | | | | | | | : | | : | | I I I I I I I I I I I I I I I I 

orf 11-1 MDFKRLTAFFAIALVIMIGWEKMFPTPKPVPAPQQAAQQQAVTASAEAALAPATPITVTT 

10 20 30 40 50 60 

70 80 90 100 110 120 

orf ling- 1 . pep DTVQAVIDEKSGDLRRLTLLKYKATGDENKPFVLFGDGKEYTYVAQSELLDAQGNNILKG 

30 || | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | : | I I I I I I I I I I I I I I I I II I I I I I I I I 

orf 11-1 DTVQAVIDEKSGDLRRLTLLKYKATGDENKPFILFGDGKEYTYVAQSELLDAQGNNILKG 

70 80 90 100 110 120 

130 140 150 160 170 180 

orf ling- 1 . pep IGFSAPKKQYTLNGDTVEVRLSAPETNGLKIDKVYTFTKDSYLVNVRFDI ANGSGQTANL 

35 III 1 1 MM h hi I Mill Mill I MINIM III I M 1 1 1 M 1 1 1 1 1 1 1 II 1 1 1 

orf 11-1 IGFSAPKKQYSLEGDKVEVRLSAPETRGLKIDKVYTFTKGSYLVNVRFDI ANGSGQTANL 

130 140 150 160 170 180 

190 200 210 220 230 240 

orf ling- 1 . pep SADYRIVRDHSEPEGQGYFTHSYVGPWYTPEGNFQKVSFSDLDDDAKSGKSEAEYIRKT 

40 I I I 1 I t I 1 t I I I 1 I 1 I I I I 1 1 I I I I I I I 1 I I I I I 1 1 1 I I i t I I I I 1 t I 1 I I I 1 I I 1 I I I I 

orf 11-1 SADYRIVRDHSEPEGQGYFTHSYVGPWYTPEGNFQKVSFSDLDDDAKSGKSEAEYIRKT 

190 200 210 220 230 240 

250 260 270 280 290 300 

orf ling- 1 .pep PTGWLGMIEHHFMSTWILQPKGGQNVCAQGDCRIDIKRRNDKLYSASVSVPLTAIPTRGP 

45 II M II II M II M I I II II 1 1 hill hi II 1 1 II 1 1 II I h II I II h I I : I 

orf 11-1 PTGWLGMIEHHFMSTWILQPKGRQSVCAAGECNIDIKRRNDKLYSTSVSVPLAAIQN-GA 

250 260 270 280 290 

310 320 330 340 350 360 

orf11ng-1.pep KPKMAVNLYAGPQTTSVIANIADNLQLAKDYGKVHWFASPLFWLLNQLHNIIGNWGWAIV

orf11-1 KAEASINLYAGPQTTSVIANIADNLQLAKDYGKVHWFASPLFWLLNQLHNIIGNWGWAII

300 310 320 330 340 350 

370 380 390 400 410 420 

5 orf ling- 1 .pep VLTIIVKAVLYPLTNASYRSMAKMRAAAPKLQTIKEKYGDDRMAQQQAMMQLYKDEKINP 

MINIMI IIIMIIIMMIIIIIIIMMIIIIIMI MUM Ml II III 

or f 1 1 - 1 VLTI I VKAVLYPLTNAS YRSMAKMRAAAPKLQAI KEKYGDDRMAQQQAMMQLYTDEKINP 

360 370 380 390 400 410 



430 440 450 460 470 480 

10 orf ling- 1. pep LGGCLPMLLQIPVFIGLYWALFASVELRQAPWLGWITDLSRADPYYILPIIMAATMFAQT 

1 1 1 Ml 1 1 M 1 1 1 1 1 1 1 1 i 1 1 1 M 1 1 1 1 M 1 1 1 1 1' I M 1 1 1 1 1 1 1 1 1 1 1 II 1 1 1 1 1 M I 

orf 11 - 1 LGGCLPMLLQIPVFIGLYWALFASVELRQAPWLGWITDLSRADPYYILPI IMAATMFAQT 

420 430 440 450 460 470 



490 500 510 520 530 540 

1 5 orf ling- 1 . pep YLNPPPTDPMQAKMMKIMPLVFSVMFFFFPAGLVLYWVWNLLTIAQQWHINRSIEKQRA 

I I I I I I I I I I I I I I I I I I I I I I I I I . I I I I I I I I I I I I I I I I 

orf 11-1 YLNPPPTDPMQAKMMKIMPLVFSVMFFFFPAGLVLYWVVNNLLTIAQQWHINRSIEKQRA 
480 490 500 510 520 530 



20 orf llng-1 .pep QGEWSX 

MMIM 

orf 11-1 QGEWSX 
54 0 

In addition, ORF11ng-1 (SEQ ID NO: 58) shows significant homology with an inner-membrane
protein from the database (accession number P25754) (SEQ ID NO: 1117):



ID 60IM_PSEPU STANDARD; PRT; 560 AA.

AC P25754;

DT 01-MAY-1992 (REL. 22, CREATED)

DT 01-MAY-1992 (REL. 22, LAST SEQUENCE UPDATE)

DT 01-NOV-1995 (REL. 32, LAST ANNOTATION UPDATE)

DE 60 KD INNER-MEMBRANE PROTEIN. . . .

SCORES Init1: 1074 Initn: 1293 Opt: 1103

Smith-Waterman score: 1406; 41.5% identity in 574 aa overlap

35 10 20 30 40 

orf ling- 1 . pep MDFKR- - -LTAFFAIALVIMIGW EKMFPT PKPVPAPQQAAQKQ 

||:|| ::|= |::: I = -II I III = = = 1 = .: 

p25754 MDIKRTILIAALAWSYVMVLKWNDDYGQAALPTQNTAASTVAPGLPDGVPAGNNGASAD 

10 20 30 40 50 60 

40 50 60 70 80 90 

orf 1 lng- 1 . pep AATAS AEAALAPAT PIT VTTDTVQAVIDEKSGDLRRLTLLKYKATGDE - NKPF 

: :|:M:: | =|s: ' | II- :|| = 11 = =1 = 1 I I 1=111 

p25754 VPSANAESSPAELAPVALSKDLIRVKTDVLELAIDPVGGDIVQLNLPKYPRRQDHPNIPF 

70 80 90 100 110 120 

45 100 * 110 120 130 140 

or f 1 1 ng - 1 . pep VLFGDGKE YT YVAQS ELLDAQGNN ILKGIG FSAPKKQYTL - NGD TVEVRLS APE 

II :| I :|:||| I = = l = == I = = | =1 = 1 I =1= =l= = = = l 
p25754 QLFDNGGERVYLAQSGLTGTDGPDA-RASGRPLYAAEQKSYQLADGQEQLWDLKFS - - - 
130 140 150 160 170

150 160 170 180 190 200 

orf ling- 1. pep TNGLKIDKVYTFTKDSYLVNVRFDIANGSGQTANLSADYRIVRDHS-EPEGQGYF-THSY 

||:: I : = l : 1=11 = 11 Mh I = :: || | :| :: | :| 

5 p2 5754 DNGVNYIKRFSFKRGEYDLNVSYLIDNQSGQAWNGNMFAQLKRDASGDPSSSTATGTATY 

180 190 200 210 220 230 

210 220 230 240 250 260 

orf ling- 1 . pep VGPWYTPEGNFQKVSFSDLDDDAKSGKSEAEYIRKTPTGWLGMIEHHFMSTWILQPKGG 

:| :::| ^lll-hl h= =1 == II- -hh-ll 1= ■ 

10 p25754 LGAALWTASEPYKKVSMKDID KGSLKE NVSGGWVAWLQHYFVTAWI - PAKSD 

240 250 260 270 280 

270 280 290 300 310 320 

orf ling- 1 . pep QNVCAQGDCRIDIKRRNDKLYSASVSVPLTAIPTRGPKPKMAVNLYAGPQTTSVIANIAD 
:|| :::::: | : : |: ::|: | | : :: llllh I : ::: 

15 p25754 NNV VQTRKDSQGNY 1 1 GYTGP VI S VPA - GGKVETS ALL YAGPKI QS KLKELS P 

290 300 310 320 330 

330 340 350 360 370 380 

orf ling- 1 .pep NLQLAKDYGKVHWF-ASPLFWLLNQLHNIIGNWGWAIVVLTIIVKAVLYPLTNASYRSMA 

:|:|: ||| : || I : I : I I I I = : = I : : : I I I I I = I = I I I = : : I = : = : II = I 1 I I I I I 
20 p25754 GLELTVDYGFL-WFIAQPIFWLLQHIHSLLGNWGWSIIVLTMLIKGLFFPLSAASYRSMA 

340 350 360 370 380 390 

390 400 410 420 430 440 

orf ling- 1 .pep KMRAAAPKLQTI KEKYGDDRMAQQQAMMQLYKDEKINPLGGCLPMLLQI PVFIGLYWALF 

:|||:|||| ::||::||||: -lllhlll I I I I I I I I I I I = I = I = I I I : = I I 1 = h 
25 p25754 RMRAVAPKLAALKERFGDDRQKMSQAMMELYKKEKINPLGGCLP I LVQMPVFLALYWVLL 

400 410 420 430 440 450 

450 460 470 480 490 500 

orf ling- 1 .pep ASVELRQAPWLGWITDLSRADPYYILPIIMAATMFAQTYLNPPPTDPMQAKMMKIMPLVF 

I I I : f I I I I : IIIMI ll-MM hill I III I 1 I I I I : I I : ! I = : I 
30 p25754 ESVEMRQAPWILWITDLSIKDPFFILPIIMGATMFIQQRLNPTPPDPMQAKVMKMMPIIF 

460 470 480 490 500 510 

510 520 530 540 

orf ling- 1. pep SVMFFFFPAGLVLYWWNNLLTIAQQWHINRSIEKQRAQGEWSX 

: h - I I I I I I I I I I I M hhllhhl II 
35 p25754 TFFFLWFPAGLVLYWWNNCLSISQQWYITRRIEAATKKAAA 

520 530 540 550 560 
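
The Smith-Waterman score reported for this comparison comes from a local-alignment dynamic programme. A minimal sketch of that recurrence follows, assuming a simple match/mismatch scheme and a linear gap penalty. The source does not state the substitution matrix or gap parameters behind the score of 1406, so the values below are illustrative only and will not reproduce that number. The function name `smith_waterman_score` is hypothetical.

```python
def smith_waterman_score(a: str, b: str, match: int = 2,
                         mismatch: int = -1, gap: int = -2) -> int:
    """Best local-alignment score between a and b (score only, no traceback)."""
    prev = [0] * (len(b) + 1)
    best = 0
    for i in range(1, len(a) + 1):
        cur = [0] * (len(b) + 1)
        for j in range(1, len(b) + 1):
            sub = match if a[i - 1] == b[j - 1] else mismatch
            # A local alignment may restart anywhere, hence the floor at zero.
            cur[j] = max(0, prev[j - 1] + sub, prev[j] + gap, cur[j - 1] + gap)
            best = max(best, cur[j])
        prev = cur
    return best


# First five residues of ORF11ng-1 and of P25754 from the alignment above.
print(smith_waterman_score("MDFKR", "MDIKR"))  # 7
```

Keeping only two rows of the score matrix is enough when the alignment itself is not needed; recovering the aligned residues would require storing the full matrix and tracing back from the best cell.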

Based on this analysis, including the homology to an inner-membrane protein from P. putida and
the predicted transmembrane domains (seen in both the meningococcal and gonococcal proteins), it is
predicted that the proteins from N. meningitidis and N. gonorrhoeae, and their epitopes, could be
useful antigens for vaccines or diagnostics, or for raising antibodies.
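
The source does not identify the method used to predict the transmembrane domains. As an illustration only, the sketch below swaps in a Kyte-Doolittle hydropathy scan, a widely used heuristic in which a 19-residue window with a mean hydropathy of about 1.6 or more is taken as a candidate membrane-spanning segment. The function names, the window length and the threshold are assumptions chosen for the example, not values taken from the source; residues outside the scale (such as X) are scored as zero.

```python
# Kyte-Doolittle hydropathy values for the twenty standard amino acids.
KD = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def hydropathy_profile(seq: str, window: int = 19) -> list:
    """Mean hydropathy of every full window along the sequence."""
    values = [KD.get(res, 0.0) for res in seq.upper()]
    return [
        sum(values[i:i + window]) / window
        for i in range(len(values) - window + 1)
    ]


def candidate_tm_windows(seq: str, window: int = 19,
                         threshold: float = 1.6) -> list:
    """1-based start positions of windows at or above the threshold."""
    profile = hydropathy_profile(seq, window)
    return [i + 1 for i, mean in enumerate(profile) if mean >= threshold]


# Residues 357-375 of ORF11-1 (SEQ ID NO: 52), taken from the listing above.
print(candidate_tm_windows("AIIVLTIIVKAVLYPLTNA"))  # [1]
```

A hit from this scan is only a hint that a segment is hydrophobic enough to span a membrane; it is not evidence of the topology asserted in the text.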

Example 8 



The following partial DNA sequence was identified in N. meningitidis [<SEQ ID 59>] (SEQ ID
NO: 59):

1 ..GCCGTCTTAA TCATCGAATT ATTGACGGGA ACGGTTTATC TTTTGGTTGT

51 NAGCGCGGCT TTGGCGGGTT CGGGCATTGC TTACGGGCTG ACCGGCAGTA 

101 CGCCTGCCGC CGTCTTGACC GNCGCTCTGC TTTCCGCGCT GGGTATTTNG 

151 TTCGTACACG CCAAAACCGC CGTTAGAAAA GTTGAAACGG ATTCATATCA 

201 GGATTTGGAT GCCGGACAAT ATGTCGAAAT CCTCCGNCAC ACAGGCGGCA

251 ACCGTTACGA AGTT . TTTAT CGCGGTACG . ACTGGCAGGC TCAAAATACG 

301 GGGCAAGAAG AGCTTGAACC AGGAACTCGC GCCCTCATTG TCCGCAAGGA 

351 AGGCAACCTT CTTATTATCA CACACCCTTA A 

This corresponds to the amino acid sequence [<SEQ ID 60; ORF13>] (SEQ ID NO: 60; ORF13):



1 ..AVLIIELLTG TVYLLVVSAA LAGSGIAYGL TGSTPAAVLT XALLSALGIX
51 FVHAKTAVRK VETDSYQDLD AGQYVEILRH TGGNRYEVXY RGTXWQAQNT
101 GQEELEPGTR ALIVRKEGNL LIITHP*

Further sequence analysis elaborated the DNA sequence slightly [<SEQ ID 61>] (SEQ ID NO: 61):



1 . .GCCGTCTTAA TCATCGAATT ATTGACGGGA ACGGTTTATC TTTTGGTTGT 

51 nAGCGCGGCT TTGGCGGGTT CGGGCATTGC TTACGGGCTG ACCGGCAGTA 

101 CGCCTGCCGC CGTCTTGACC GnCGCTCTGC TTTCCGCGCT GGGTATTTnG 

151 TTCGTACACG CCAAAACCGC CGTTAGAAAA GTTGAAACGG ATTCATATCA 

201 GGATTTGGAT GCCGGACAAT ATGTCGAAAT CCTCCGACAC ACAGGCGGCA 

251 ACCGTTACGA AGTTTTtTAT CGCGGTACGc ACTGGCAGGC TCAAAATACG 

301 GGGCAAGAAG AGCTTGAACC AGGAACTCGC GCCCTCATTG TCCGCAAGGA 

351 AGGCAACCTT CTTATTATCA CACACCCTTA A 

This corresponds to the amino acid sequence [<SEQ ID 62; ORF13-1>] (SEQ ID NO: 62; ORF13-1):

1 ..AVLIIELLTG TVYLLVVSAA LAGSGIAYGL TGSTPAAVLT XALLSALGIX

51 FVHAKTAVRK VETDSYQDLD AGQYVEILRH TGGNRYEVFY RGTHWQAQNT

101 GQEELEPGTR ALIVRKEGNL LIITHP*

Computer analysis of this amino acid sequence gave the following results: 
Homology with a predicted ORF from N. meningitidis (strain A)

ORF13 (SEQ ID NO: 60) shows 92.9% identity over a 126aa overlap with an ORF (ORF13a)
(SEQ ID NO: 64) from strain A of N. meningitidis:



                             10        20        30        40        50
orf13.pep            AVLIIELLTGTVYLLVVSAALAGSGIAYGLTGSTPAAVLTXALLSALGIXF
                     |||||||||||||||||||||||||||||||||||||||| |||||||| |
orf13a      MTVWFVAAVAVLIIELLTGTVYLLVVSAALAGSGIAYGLTGSTPAAVLTAALLSALGIWF
                    10        20        30        40        50        60

                   60        70        80        90       100       110
orf13.pep   VHAKTAVRKVETDSYQDLDAGQYVEILRHTGGNRYEVXYRGTXWQAQNTGQEELEPGTRA
            ||||||| ||||||||||||||| ||||| ||||||| |||| |||||||||||||||||
orf13a      VHAKTAVGKVETDSYQDLDAGQYAEILRHAGGNRYEVFYRGTHWQAQNTGQEELEPGTRA
                    70        80        90       100       110       120

                  120
orf13.pep   LIVRKEGNLLIITHPX
            ||||||||||||  ||
orf13a      LIVRKEGNLLIIAKPX
                   130



The complete length ORF13a nucleotide sequence [<SEQ ID 63>] (SEP ID NO: 63) is: 

1 ATGACTGTAT GGTTTGTTGC CGCTGTTGCC GTCTTAATCA TCGAATTATT

51 GACGGGAACG GTTTATCTTT TGGTTGTCAG CGCGGCTTTG GCGGGTTCGG

101 GCATTGCTTA CGGGCTGACC GGCAGCACGC CTGCCGCCGT CTTGACCGCC

151 GCTCTGCTTT CCGCGCTGGG TATTTGGTTC GTACACGCCA AAACCGCCGT

201 GGGAAAAGTT GAAACGGATT CATATCAGGA TTTGGATGCC GGGCAATATG

251 CCGAAATCCT CCGGCACGCA GGCGGCAACC GTTACGAAGT TTTTTATCGC

301 GGTACGCACT GGCAGGCTCA AAATACGGGG CAAGAAGAGC TTGAACCAGG

351 AACGCGCGCC CTAATCGTCC GCAAGGAAGG CAACCTTCTT ATCATCGCAA

401 AACCTTAA



This encodes a protein having amino acid sequence [<SEQ ID 64>] (SEQ ID NO: 64):



1 MTVWFVAAVA VLIIELLTGT VYLLVVSAAL AGSGIAYGLT GSTPAAVLTA
51 ALLSALGIWF VHAKTAVGKV ETDSYQDLDA GQYAEILRHA GGNRYEVFYR

101 GTHWQAQNTG QEELEPGTRA LIVRKEGNLL IIAKP*

ORF13a (SEQ ID NO: 64) and ORF13-1 (SEQ ID NO: 62) show 94.4% identity in 126 aa overlap:



                    10        20        30        40        50        60
orf13a.pep  MTVWFVAAVAVLIIELLTGTVYLLVVSAALAGSGIAYGLTGSTPAAVLTAALLSALGIWF
                     |||||||||||||||||||||||||||||||||||||||| |||||||| |
orf13-1              AVLIIELLTGTVYLLVVSAALAGSGIAYGLTGSTPAAVLTXALLSALGIXF
                             10        20        30        40        50

                    70        80        90       100       110       120
orf13a.pep  VHAKTAVGKVETDSYQDLDAGQYAEILRHAGGNRYEVFYRGTHWQAQNTGQEELEPGTRA
            ||||||| ||||||||||||||| ||||| ||||||||||||||||||||||||||||||
orf13-1     VHAKTAVRKVETDSYQDLDAGQYVEILRHTGGNRYEVFYRGTHWQAQNTGQEELEPGTRA
                   60        70        80        90       100       110

                   130
orf13a.pep  LIVRKEGNLLIIAKPX
            ||||||||||||  ||
orf13-1     LIVRKEGNLLIITHPX
                  120

Homology with a predicted ORF from N. gonorrhoeae 

ORF13 (SEQ ID NO: 60) shows 89.7% identity over a 126aa overlap with a predicted ORF
(ORF13.ng) (SEQ ID NO: 66) from N. gonorrhoeae:



orf13             AVLIIELLTGTVYLLVVSAALAGSGIAYGLTGSTPAAVLTXALLSALGIXF 51
                  |||||||||||||||||||||||||||||||||||||||| |||||||| |
orf13ng  MTVWFVAAVAVLIIELLTGTVYLLVVSAALAGSGIAYGLTGSTPAAVLTAALLSALGIWF 60

orf13    VHAKTAVRKVETDSYQDLDAGQYVEILRHTGGNRYEVXYRGTXWQAQNTGQEELEPGTRA 111
         ||||||| ||||||||||| | | |||| |||||||| |||| |||||||||  ||||||
orf13ng  VHAKTAVGKVETDSYQDLDTGKYAEILRYTGGNRYEVFYRGTHWQAQNTGQEVFEPGTRA 120

orf13    LIVRKEGNLLIITHP 126
         ||||||||||||  |
orf13ng  LIVRKEGNLLIIANP 135

The complete length ORF13ng nucleotide sequence [<SEQ ID 65>] (SEQ ID NO: 65) is:

1 ATGACTGTAT GGTTTGTTGC CGCTGTTGCC GTCTTAATCA TCGAATTATT

51 GACGGGAACG GTTTATCTTT TGGTTGTCAG CGCGGCTTTG GCGGGTTCGG

101 GCATTGCCTA CGGGCTGACT GGCAGCACGC CTGCCGCCGT CTTGACCGCC

151 GCACTGCTTT CCGCGCTGGG CATTTGGTTC GTACATGCCA AAACCGCCGT

201 GGGAAAAGTT GAAACGGATT CATATCAGGA TTTGGATACC GGAAAATATG

251 CCGAAATCCT CCGATACACA GGCGGCAACC GTTACGAAGT TTTTTATCGC

301 GGTACGCACT GGCAGGCGCA AAATACGGGG CAGGAAGTGT TTGAACCGGG

351 AACGCGCGCC CTCATCGTCC GCAAAGAAGG TAACCTTCTT ATCATCGCAA

401 ACCCTTAA

This encodes a protein having amino acid sequence [<SEQ ID 66>] (SEQ ID NO: 66):

1 MTVWFVAAVA VLIIELLTGT VYLLVVSAAL AGSGIAYGLT GSTPAAVLTA

51 ALLSALGIWF VHAKTAVGKV ETDSYQDLDT GKYAEILRYT GGNRYEVFYR
101 GTHWQAQNTG QEVFEPGTRA LIVRKEGNLL IIANP*

ORF13ng (SEQ ID NO: 66) shows 91.3% identity in 126 aa overlap with ORF13-1 (SEQ ID NO:
62):

                             10        20        30        40        50
orf13-1.pep          AVLIIELLTGTVYLLVVSAALAGSGIAYGLTGSTPAAVLTXALLSALGIXF
                     |||||||||||||||||||||||||||||||||||||||| |||||||| |
orf13ng     MTVWFVAAVAVLIIELLTGTVYLLVVSAALAGSGIAYGLTGSTPAAVLTAALLSALGIWF
                    10        20        30        40        50        60

                   60        70        80        90       100       110
orf13-1.pep VHAKTAVRKVETDSYQDLDAGQYVEILRHTGGNRYEVFYRGTHWQAQNTGQEELEPGTRA
            ||||||| ||||||||||| | | |||| |||||||||||||||||||||||  ||||||
orf13ng     VHAKTAVGKVETDSYQDLDTGKYAEILRYTGGNRYEVFYRGTHWQAQNTGQEVFEPGTRA
                    70        80        90       100       110       120

                  120
orf13-1.pep LIVRKEGNLLIITHPX
            ||||||||||||  ||
orf13ng     LIVRKEGNLLIIANPX
                   130

Based on this analysis, including the extensive leader sequence in this protein, it is predicted that
ORF13 (SEQ ID NO: 60) and ORF13ng (SEQ ID NO: 66) are likely to be outer membrane
proteins. It is thus predicted that the proteins from N. meningitidis and N. gonorrhoeae, and their
epitopes, could be useful antigens for vaccines or diagnostics, or for raising antibodies.



CHIR-0160 (356.001) 






PATENT 



Example 9 



The following DNA sequence was identified in N. meningitidis [<SEQ ID 67>] (SEQ ID NO: 67):



1 ATGTwTGATT TCGGTTTrGG CGArCTGGTT TTTGTCGGCA TTATCGCCCT 

51 GATwGtCCTC GGCCCCGAAC GCsTGCCCGA GGCCGCCCGC AyCGCCGGAC 

101 GGcTCATCGG CAGGCTGCAA CGCTTTGTCG GcAGCGTCAA ACAGGAATTT 

151 GACACTCAAA TCGAACTGGA AGAACTGAGG AAGGCAAAGC AGGAATTTGA 

201 AGCTGCCGcC GCTCAGGTTC GAGACAGCCT CAAAGAAACC GGTACGGATA 

251 TGGAAGGCAA TCTGCACGAC ATTTCCGACG GTCTGAAGCC TTGGGAAAAA 

301 CTGCCCGAAC AGCGGACACC TGCCGATTTC GGTGTCGATG AAAACGGCAA 

351 TCCGCT.TCC CGATGCGGCA AACACCCTAT CAGACGGCAT TTCCGACGTT 

401 ATGCCGTC . . 

This corresponds to the amino acid sequence [<SEQ ID 68; ORF2>] (SEQ ID NO: 68; ORF2):



1 MXDFGLGELV FVGIIALIVL GPERXPEAAR XAGRLIGRLQ RFVGSVKQEF 
51 DTQIELEELR KAKQEFEAAA AQVRDSLKET GTDMEGNLHD ISDGLKPWEK 
101 LPEQRTPADF GVDENGNPXS RCGKHPIRRH FRRYAV. . 

Further work revealed the complete nucleotide sequence [<SEQ ID 69>] (SEQ ID NO: 69):



1 ATGTTTGATT TCGGTTTGGG CGAGCTGGTT TTTGTCGGCA TTATCGCCCT 

51 GATTGTCCTC GGCCCCGAAC GCCTGCCCGA GGCCGCCCGC ACCGCCGGAC 

101 GGCTCATCGG CAGGCTGCAA CGCTTTGTCG GCAGCGTCAA ACAGGAATTT 

151 GACACTCAAA TCGAACTGGA AGAACTGAGG AAGGCAAAGC AGGAATTTGA 

201 AGCTGCCGCC GCTCAGGTTC GAGACAGCCT CAAAGAAACC GGTACGGATA 

251 TGGAAGGCAA TCTGCACGAC ATTTCCGACG GTCTGAAGCC TTGGGAAAAA 

301 CTGCCCGAAC AGCGGACACC TGCCGATTTC GGTGTCGATG AAAACGGCAA 

351 TCCGCTTCCC GATGCGGCAA ACACCCTATC AGACGGCATT TCCGACGTTA 

401 TGCCGTCCGA ACGTTCCTAC GCTTCCGCCG AAACCCTTGG GGACAGCGGG

451 CAAACCGGCA GTACAGCCGA ACCCGCGGAA ACCGACCAAG ACCGCGCATG

501 GCGGGAATAC CTGACTGCTT CTGCCGCCGC ACCCGTCGTA CAGACCGTCG 

551 AAGTCAGCTA TATCGATACT GCTGTTGAAA CGCCTGTTCC GCACACCACT 

601 TCCCTGCGCA AACAGGCAAT AAGCCGCAAA CGCGATTTTC GTCCGAAACA 

651 CCGCGCCAAA CCTAAATTGC GCGTCCGTAA ATCATAA 
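
As a consistency check on listings of this kind, the numbered rows can be joined into one contiguous string and its length compared with the encoded protein. The sketch below is illustrative only: the helper name is hypothetical, and it assumes each row begins with a position number followed by space-separated blocks of ten characters.

```python
import re


def join_numbered_blocks(listing: str) -> str:
    """Join the rows of a numbered sequence listing into one string."""
    parts = []
    for line in listing.splitlines():
        body = re.sub(r"^[\d\s]+", "", line)     # drop the leading position number
        parts.append(re.sub(r"\s+", "", body))   # drop spaces between blocks
    return "".join(parts)


# Last two rows of the complete nucleotide sequence listed above.
tail = join_numbered_blocks(
    "601 TCCCTGCGCA AACAGGCAAT AAGCCGCAAA CGCGATTTTC GTCCGAAACA\n"
    "\n"
    "651 CCGCGCCAAA CCTAAATTGC GCGTCCGTAA ATCATAA"
)
total_length = 600 + len(tail)
print(total_length, total_length // 3)
```

The listing ends at position 687, that is 229 codons including the stop codon, which is consistent with a 228-residue protein.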

This corresponds to the amino acid sequence [<SEQ ID 70; ORF2-1>] (SEQ ID NO: 70; ORF2-1):

1 MFDFGLGELV FVGIIALIVL GPERLPEAAR TAGRLIGRLQ RFVGSVKQEF

51 DTQIELEELR KAKQEFEAAA AQVRDSLKET GTDMEGNLHD ISDGLKPWEK 

101 LPEQRTPADF GVDENGNPLP DAANTLSDGI SDVMPSERSY ASAETLGDSG 

151 QTGSTAEPAE TDQDRAWREY LTASAAAPVV QTVEVSYIDT AVETPVPHTT

201 SLRKQAISRK RDFRPKHRAK PKLRVRKS* 

Further work identified the corresponding gene in strain A of N. meningitidis [<SEQ ID 71>] (SEQ ID NO: 71):



1 ATGTTTGATT TCGGTTTGGG CGAGCTGGTT TTTGTCGGCA TTATCGCCCT 

51 GATTGTCCTC GGCCCCGAAC GCCTGCCCGA GGCCGCCCGC ACCGCCGGAC 

101 GGCTCATCGG CAGGCTGCAA CGCTTTGTCG GCAGCGTCAA ACAGGAATTT 

151 GACACGCAAA TCGAACTGGA AGAACTAAGG AAGGCAAAGC AGGAATTTGA 

201 AGCTGCCGCT GCTCAGGTTC GAGACAGCCT CAAAGAAACC GGTACGGATA 
251 TGGAGGGTAA TCTGCACGAC ATTTCCGACG GTCTGAAGCC TTGGGAAAAA

301 CTGCCCGAAC AGCGCACGCC TGCTGATTTC GGTGTCGATG AAAACGGCAA 

351 TCCCTTTCCC GATGCGGCAA ACACCCTATT AGACGGCATT TCCGACGTTA 

401 TGCCGTCCGA ACGTTCCTAC GCTTCCGCCG AAACCCTTGG GGACAGCGGG

451 CAAACCGGCA GTACAGCCGA ACCCGCGGAA ACCGACCAAG ACCGTGCATG

501 GCGGGAATAC CTGACTGCTT CTGCCGCCGC ACCCGTCGTA CAGACCGTCG 

551 AAGTCAGCTA TATCGATACC GCTGTTGAAA CCCCTGTTCC GCATACCACT 

601 TCGCTGCGTA AACAGGCAAT AAGCCGCAAA CGCGATTTGC GTCCTAAATC 

651 CCGCGCCAAA CCTAAATTGC GCGTCCGTAA ATCATAA 


This encodes a protein having amino acid sequence [<SEQ ID 72; ORF2a>] (SEQ ID NO: 72; ORF2a):



1 MFDFGLGELV FVGIIALIVL GPERLPEAAR TAGRLIGRLQ RFVGSVKQEF

51 DTQIELEELR KAKQEFEAAA AQVRDSLKET GTDMEGNLHD ISDGLKPWEK

101 LPEQRTPADF GVDENGNPFP DAANTLLDGI SDVMPSERSY ASAETLGDSG

151 QTGSTAEPAE TDQDRAWREY LTASAAAPVV QTVEVSYIDT AVETPVPHTT

201 SLRKQAISRK RDLRPKSRAK PKLRVRKS*

The originally-identified partial strain B sequence (ORF2) (SEQ ID NO: 68) shows 97.5% identity 
over a 118aa overlap with ORF2a (SEQ ID NO: 72):

                    10        20        30        40        50        60
orf2.pep    MXDFGLGELVFVGIIALIVLGPERXPEAARXAGRLIGRLQRFVGSVKQEFDTQIELEELR
            | |||||||||||||||||||||| ||||| |||||||||||||||||||||||||||||
orf2a       MFDFGLGELVFVGIIALIVLGPERLPEAARTAGRLIGRLQRFVGSVKQEFDTQIELEELR
                    10        20        30        40        50        60

                    70        80        90       100       110       120
orf2.pep    KAKQEFEAAAAQVRDSLKETGTDMEGNLHDISDGLKPWEKLPEQRTPADFGVDENGNPXS
            ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
orf2a       KAKQEFEAAAAQVRDSLKETGTDMEGNLHDISDGLKPWEKLPEQRTPADFGVDENGNPFP
                    70        80        90       100       110       120

                   130
orf2.pep    RCGKHPIRRHFRRYAV
orf2a       DAANTLLDGISDVMPSERSYASAETLGDSGQTGSTAEPAETDQDRAWREYLTASAAAPVV
                   130       140       150       160       170       180

The complete strain B sequence (ORF2-1) (SEQ ID NO: 70) and ORF2a (SEQ ID NO: 72) show 
98.2% identity in 228 aa overlap:

orf2a.pep   MFDFGLGELVFVGIIALIVLGPERLPEAARTAGRLIGRLQRFVGSVKQEFDTQIELEELR 60
            ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
orf2-1      MFDFGLGELVFVGIIALIVLGPERLPEAARTAGRLIGRLQRFVGSVKQEFDTQIELEELR 60

orf2a.pep   KAKQEFEAAAAQVRDSLKETGTDMEGNLHDISDGLKPWEKLPEQRTPADFGVDENGNPFP 120
            |||||||||||||||||||||||||||||||||||||||||||||||||||||||||| |
orf2-1      KAKQEFEAAAAQVRDSLKETGTDMEGNLHDISDGLKPWEKLPEQRTPADFGVDENGNPLP 120

orf2a.pep   DAANTLLDGISDVMPSERSYASAETLGDSGQTGSTAEPAETDQDRAWREYLTASAAAPVV 180
            |||||| |||||||||||||||||||||||||||||||||||||||||||||||||||||
orf2-1      DAANTLSDGISDVMPSERSYASAETLGDSGQTGSTAEPAETDQDRAWREYLTASAAAPVV 180

orf2a.pep   QTVEVSYIDTAVETPVPHTTSLRKQAISRKRDLRPKSRAKPKLRVRKSX 229
            |||||||||||||||||||||||||||||||| ||| ||||||||||||
orf2-1      QTVEVSYIDTAVETPVPHTTSLRKQAISRKRDFRPKHRAKPKLRVRKSX 229
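
The identity figures quoted for these alignments can be reproduced by counting identical columns. The following is a minimal sketch, assuming the two rows are already aligned and of equal length, with "-" as the gap character; the function name is hypothetical and the calculation is not asserted to be the one performed by the original alignment software.

```python
def percent_identity(row_a: str, row_b: str, gap: str = "-") -> float:
    """Percentage of aligned columns in which both rows carry the same residue."""
    if len(row_a) != len(row_b):
        raise ValueError("aligned rows must have equal length")
    # Ignore columns that are gaps in both rows.
    columns = [(a, b) for a, b in zip(row_a, row_b) if not (a == gap and b == gap)]
    matches = sum(1 for a, b in columns if a == b and a != gap)
    return 100.0 * matches / len(columns)


# Final block of the alignment above (the trailing X marks the stop codon).
orf2a_tail = "QTVEVSYIDTAVETPVPHTTSLRKQAISRKRDLRPKSRAKPKLRVRKSX"
orf2_1_tail = "QTVEVSYIDTAVETPVPHTTSLRKQAISRKRDFRPKHRAKPKLRVRKSX"
print(round(percent_identity(orf2a_tail, orf2_1_tail), 1))

# Whole alignment: 4 differing residues over the 228 aa overlap.
print(round(100.0 * (228 - 4) / 228, 1))
```

Counting the four substitutions visible in the alignment (L/F at 119, S/L at 127, F/L at 213 and H/S at 217) gives 224 identities in 228 residues, which rounds to the stated 98.2%.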

Further work identified a partial DNA sequence [<SEQ ID 73>] (SEQ ID NO: 73) in 
N. gonorrhoeae encoding the following amino acid sequence [<SEQ ID 74; ORF2ng>] (SEQ ID NO: 74; ORF2ng):

1 MFDFGLGELI FVGIIALIVL GPERLPEAAR TAGRLIGRLQ RFVGSVKQEL
51 DTQIELEELR KVKQAFEAAA AQVRDSLKET DTDMQNSLHD ISDGLKPWEK 
101 LPEQRTPADF GVDEKGNSLS RYGKHRIRRH FRRYAV* 


Further work identified the complete gonococcal gene sequence [<SEQ ID 75>] (SEQ ID NO: 75):



1 ATGTTTGATT TCGGTTTGGG CGAGCTGATT TTTGTCGGCA TTATCGCCCT 

51 GATTGTCCTT GGTCCAGAAC GCCTGCCCGA AGCCGCCCGC ACTGCCGGAC 

101 GGCTTATCGG CAGGCTGCAA CGCTTTGTAG GAAGCGTCAA ACAAGAACTT

151 GACACTCAAA TCGAACTGGA AGAGCTGAGG AAGGTCAAGC AGGCATTCGA 

201 AGCTGCCGCC GCTCAGGTTC GAGACAGCCT CAAAGAAACC GATACGGATA 

251 TGCAGAACAG TCTGCACGAC ATTTCCGACG GTCTGAAGCC TTGGGAAAAA 

301 CTGCCCGAAC AGCGCACGCc tgccgatttc gGTGTCGATg AAAacggcaa 

351 tccccttccc gATACGGCAA ACACCGTATC AGACGGCATT TCCGACGTTA

401 TGCCGTCTGA ACGTTCCGAT ACTtccgcCG AAACCCTTGG GGACGACAGG 

451 CAAACCGGCA GTACAGCCGA ACCTGCGGAA ACCGACAAAG ACCGCGCATG

501 GCGGGAATAC CTGactgctt ctgccgccgc acctgtcgta Cagagggccg 

551 tcgaagtcag ctaTATCGAT ACTGCTGTTG AAacgcctgT tccgcaCacc 

601 acttccctgc gcaAACAGGC AATAAACCGC AAACGCGATT TttgtccgaA

651 ACACCGCGCC aAACCGAAat tgcgcgtcCG TAAATCATAA 

This encodes a protein having the amino acid sequence [<SEQ ID 76; ORF2ng-1>] (SEQ ID NO: 76; ORF2ng-1):

1 MFDFGLGELI FVGIIALIVL GPERLPEAAR TAGRLIGRLQ RFVGSVKQEL
51 DTQIELEELR KVKQAFEAAA AQVRDSLKET DTDMQNSLHD ISDGLKPWEK
101 LPEQRTPADF GVDENGNPLP DTANTVSDGI SDVMPSERSD TSAETLGDDR
151 QTGSTAEPAE TDKDRAWREY LTASAAAPVV QRAVEVSYID TAVETPVPHT
201 TSLRKQAINR KRDFCPKHRA KPKLRVRKS*


The originally-identified partial strain B sequence (ORF2) (SEQ ID NO: 68) shows 87.5% identity 
over a 136aa overlap with ORF2ng (SEQ ID NO: 74):

orf2.pep    MXDFGLGELVFVGIIALIVLGPERXPEAARXAGRLIGRLQRFVGSVKQEFDTQIELEELR 60
            | ||||||| |||||||||||||| ||||| |||||||||||||||||| ||||||||||
orf2ng      MFDFGLGELIFVGIIALIVLGPERLPEAARTAGRLIGRLQRFVGSVKQELDTQIELEELR 60

orf2.pep    KAKQEFEAAAAQVRDSLKETGTDMEGNLHDISDGLKPWEKLPEQRTPADFGVDENGNPXS 120
            | || ||||||||||||||| |||   ||||||||||||||||||||||||||| ||
orf2ng      KVKQAFEAAAAQVRDSLKETDTDMQNSLHDISDGLKPWEKLPEQRTPADFGVDEKGNSLP 120

orf2.pep    RCGKHPIRRHFRRYAV 136
            | ||| ||||||||||
orf2ng      RYGKHRIRRHFRRYAV 136

The complete strain B and gonococcal sequences (ORF2-1 & ORF2ng-1) (SEQ ID NO: 70 & SEQ 
ID NO: 76) show 91.1% identity in 229 aa overlap:



                    10        20        30        40        50        60
orf2-1.pep  MFDFGLGELVFVGIIALIVLGPERLPEAARTAGRLIGRLQRFVGSVKQEFDTQIELEELR
            ||||||||| ||||||||||||||||||||||||||||||||||||||| ||||||||||
orf2ng-1    MFDFGLGELIFVGIIALIVLGPERLPEAARTAGRLIGRLQRFVGSVKQELDTQIELEELR
                    10        20        30        40        50        60

                    70        80        90       100       110       120
orf2-1.pep  KAKQEFEAAAAQVRDSLKETGTDMEGNLHDISDGLKPWEKLPEQRTPADFGVDENGNPLP
            | || ||||||||||||||| |||   |||||||||||||||||||||||||||||||||
orf2ng-1    KVKQAFEAAAAQVRDSLKETDTDMQNSLHDISDGLKPWEKLPEQRTPADFGVDENGNPLP
                    70        80        90       100       110       120

                   130       140       150       160       170       180
orf2-1.pep  DAANTLSDGISDVMPSERSYASAETLGDSGQTGSTAEPAETDQDRAWREYLTASAAAPVV
            | ||| |||||||||||||  |||||||  |||||||||||| |||||||||||||||||
orf2ng-1    DTANTVSDGISDVMPSERSDTSAETLGDDRQTGSTAEPAETDKDRAWREYLTASAAAPVV
                   130       140       150       160       170       180

                    190       200       210       220      229
orf2-1.pep  Q-TVEVSYIDTAVETPVPHTTSLRKQAISRKRDFRPKHRAKPKLRVRKSX
            |  ||||||||||||||||||||||||| ||||| |||||||||||||||
orf2ng-1    QRAVEVSYIDTAVETPVPHTTSLRKQAINRKRDFCPKHRAKPKLRVRKSX
                   190       200       210       220       230

Computer analysis of these amino acid sequences indicates a transmembrane region (underlined), 
and also revealed homology (59% identity) between the gonococcal sequence and the TatB protein 
(SEQ ID NO: 1118) of E. coli:



gnl|PID|e1292181 (AJ005830) TatB protein [Escherichia coli] Length = 171
 Score = 56.6 bits (134), Expect = 1e-07
 Identities = 30/88 (34%), Positives = 52/88 (59%), Gaps = 1/88 (1%)

Query: 1  MFDFGLGELIFVGIIALIVLGPERLPEAARTAGRLIGRLQRFVGSVKQELDTQIELEELR 60
          MFD G  EL+ V II L+VLGP+RLP A +T    I  L+    +V+ EL  +++L+E +
Sbjct: 1  MFDIGFSELLLVFIIGLVVLGPQRLPVAVKTVAGWIRALRSLATTVQNELTQELKLQEFQ 60

Query: 61 -KVKQAFEAAAAQVRDSLKETDTDMQNS 87
            +K+  +A+   +   LK +  +++ +
Sbjct: 61 DSLKKVEKASLTNLTPELKASMDELRQA 88
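
The summary counts in the BLAST output above can be cross-checked against the middle line of the alignment, where a letter marks an identical residue and "+" marks a conservative substitution. The sketch below is illustrative only: the function name is hypothetical, and rounding to the nearest whole percent is an assumption about how the percentages were derived.

```python
def summarize_midline(midline: str) -> tuple[int, int]:
    """Return (identities, positives) counted from a BLAST middle line."""
    identities = sum(ch.isalpha() for ch in midline)   # letters mark identical residues
    positives = identities + midline.count("+")        # '+' marks conservative changes
    return identities, positives


# Middle lines of the two alignment blocks shown above.
midlines = [
    "MFD G  EL+ V II L+VLGP+RLP A +T    I  L+    +V+ EL  +++L+E +",
    "  +K+  +A+   +   LK +  +++ +",
]
identities = sum(summarize_midline(m)[0] for m in midlines)
positives = sum(summarize_midline(m)[1] for m in midlines)
aligned_length = 88

print(identities, round(100 * identities / aligned_length))
print(positives, round(100 * positives / aligned_length))
```

On this count the 34% figure corresponds to Identities (30/88) and the 59% figure to Positives (52/88), as reported in the BLAST summary lines.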

Based on this analysis, it was predicted that ORF2 (SEQ ID NO: 68), ORF2a (SEQ ID NO: 72) and 
ORF2ng (SEQ ID NO: 74) are likely to be membrane proteins and so the proteins from 
N. meningitidis and N. gonorrhoeae, and their epitopes, could be useful antigens for vaccines or 
diagnostics, or for raising antibodies. 

ORF2-1 (SEQ ID NO: 70) (16kDa) was cloned in pET and pGex vectors and expressed in E. coli, 
as described above. The products of protein expression and purification were analyzed by 
SDS-PAGE. Figure 3A shows the results of affinity purification of the GST-fusion protein, and Figure 
3B shows the results of expression of the His-fusion in E. coli. Purified GST-fusion protein was 
used to immunise mice, whose sera were used for Western blots (Figure 3C), ELISA (positive 
result), and FACS analysis (Figure 3D). These experiments confirm that ORF37-1 (SEQ ID NO: 4) 
is a surface-exposed protein, and that it is a useful immunogen. 

Example 10 

The following partial DNA sequence was identified in N. meningitidis [<SEQ ID 77>] (SEQ ID NO: 77):

1 ATGCAAGCAC GGCTGCTGAT ACCTATTCTT TTTTCAGTTT TTATTTTATC 

51 CGC . TGCGGG ACACTGACAG GTATTCCATC GCATGGCGgA GkTAAACgCT 

101 TTgCGGTCGA ACAAGAACTT GTGGCCGCTT CTGCCAGAGC TGCCGTTAAA 

151 GACATGGATT TACAGGCATT ACACGGACGA AAAGTTGCAT TGTACATTGC 

201 CACTATGGGC GACCAAGGTT CAGGcAGTTT GACAGGGGGG JCGCTACTCC 

251 ATTGATGCAC kGrTwCsTGG CGAATACATA AACAGCCCTG CCGTCCGTAC 

301 CGATTACACC TATCCACGTT ACGAAACCAC CGCTGAAACA ACATCAGGCG 

351 GTTTGACAGG TTTAACCACT TCTTTATCTA CACTTAATGC CCCTGCACTC 

401 TCTCGCACCC AATCAGACGG TAGCGGAAGT AAAAGCAGTC TGGGCTTAAA

451 TATTGGCGGG ATGGGGGATT ATCGAAATGA AACCTTGACG ACTAACCCGC

501 GCGACACTGC CTTTCTTTCC CACTTGGTAC AGACCGTATT TTTCCTGCGC 

551 GGCATAGACG TTGTTTCTCC TGCCAATGCC GATACAGATG TGTTTATTAA 

601 CATCGACGTA TTCGGAACGA TACGCAACAG AACCGAAATG. . 

This corresponds to the amino acid sequence [<SEQ ID 78; ORF15>] (SEQ ID NO: 78; ORF15):

1 MQARLLIPIL FSVFILSACG TLTGIPSHGG XKRFAVEQEL VAASARAAVK

51 DMDLQALHGR KVALYIATMG DQGSGSLTGG RYSIDAXXXG EYINSPAVRT 

101 DYTYPRYETT AETTSGGLTG LTTSLSTLNA PALSRTQSDG SGSKSSLGLN 

151 IGGMGDYRNE TLTTNPRDTA FLSHLVQTVF FLRGIDVVSP ANADTDVFIN

201 IDVFGTIRNR TEM. . 

Further work revealed the complete nucleotide sequence [<SEQ ID 79>] (SEQ ID NO: 79):

1 ATGCAAGCAC GGCTGCTGAT ACCTATTCTT TTTTCAGTTT TTATTTTATC 

51 CGCCTGCGGG ACACTGACAG GTATTCCATC GCATGGCGGA GGTAAACGCT 

101 TTGCGGTCGA ACAAGAACTT GTGGCCGCTT CTGCCAGAGC TGCCGTTAAA 

151 GACATGGATT TACAGGCATT ACACGGACGA AAAGTTGCAT TGTACATTGC 

201 CACTATGGGC GACCAAGGTT CAGGCAGTTT GACAGGGGGT CGCTACTCCA 

251 TTGATGCACT GATTCGTGGC GAATACATAA ACAGCCCTGC CGTCCGTACC 

301 GATTACACCT ATCCACGTTA CGAAACCACC GCTGAAACAA CATCAGGCGG 

351 TTTGACAGGT TTAACCACTT CTTTATCTAC ACTTAATGCC CCTGCACTCT 

401 CTCGCACCCA ATCAGACGGT AGCGGAAGTA AAAGCAGTCT GGGCTTAAAT 

451 ATTGGCGGGA TGGGGGATTA TCGAAATGAA ACCTTGACGA CTAACCCGCG

501 CGACACTGCC TTTCTTTCCC ACTTGGTACA GACCGTATTT TTCCTGCGCG

551 GCATAGACGT TGTTTCTCCT GCCAATGCCG ATACAGATGT GTTTATTAAC

601 ATCGACGTAT TCGGAACGAT ACGCAACAGA ACCGAAATGC ACCTATACAA 

651 TGCCGAAACA CTGAAAGCCC AAACAAAACT GGAATATTTC GCAGTAGACA 

701 GAACCAATAA AAAATTGCTC ATCAAACCAA AAACCAATGC GTTTGAAGCT 

751 GCCTATAAAG AAAATTACGC ATTGTGGATG GGGCCGTATA AAGTAAGCAA 

801 AGGAATTAAA CCGACGGAAG GATTAATGGT CGATTTCTCC GATATCCGAC 

851 CATACGGCAA TCATACGGGT AACTCCGCCC CATCCGTAGA GGCTGATAAC 

901 AGTCATGAGG GGTATGGATA CAGCGATGAA GTAGTGCGAC AACATAGACA 

951 AGGACAACCT TGA 

This corresponds to the amino acid sequence [<SEQ ID 80; ORF15-1>] (SEQ ID NO: 80; ORF15-1):

1 MQARLLIPIL FSVFILSACG TLTGIPSHGG GKRFAVEQEL VAASARAAVK

51 DMDLQALHGR KVALYIATMG DQGSGSLTGG RYSIDALIRG EYINSPAVRT 

101 DYTYPRYETT AETTSGGLTG LTTSLSTLNA PALSRTQSDG SGSKSSLGLN 

151 IGGMGDYRNE TLTTNPRDTA FLSHLVQTVF FLRGIDVVSP ANADTDVFIN

201 IDVFGTIRNR TEMHLYNAET LKAQTKLEYF AVDRTNKKLL IKPKTNAFEA 

251 AYKENYALWM GPYKVSKGIK PTEGLMVDFS DIRPYGNHTG NSAPSVEADN 

301 SHEGYGYSDE VVRQHRQGQP*

Further work identified the corresponding gene in strain A of N. meningitidis [<SEQ ID 81>] (SEQ ID NO: 81):



1 ATGCAAGCAC GGCTGCTGAT ACCTATTCTT TTTTCAGTTT TTATTTTATC 

51 CGCCTGCGGG ACACTGACAG GTATTCCATC GCATGGCGGA GGTAAACGCT 

101 TTGCGGTCGA ACAAGAACTT GTGGCCGCTT CTGCCAGAGC TGCCGTTAAA 

151 GACATGGATT TACAGGCATT ACACGGACGA AAAGTTGCAT TGTACATTGC 

201 AACTATGGGC GACCAAGGTT CAGGCAGTTT GACAGGGGGT CGCTACTCCA
251 TTGATGCACT GATTCGTGGC GAATACATAA ACAGCCCTGC CGTCCGTACC 

301 GATTACACCT ATCCACGTTA CGAAACCACC GCTGAAACAA CATCAGGCGG
351 TTTGACAGGT TTAACCACTT CTTTATCTAC ACTTAATGCC CCTGCACTCT 

401 CGCGCACCCA ATCAGACGGT AGCGGAAGTA AAAGCAGTCT GGGCTTAAAT
451 ATTGGCGGGA TGGGGGATTA TCGAAATGAA ACCTTGACGA CTAACCCGCG
501 CGACACTGCC TTTCTTTCCC ACTTGGTACA GACCGTATTT TTCCTGCGCG 
551 GCATAGACGT TGTTTCTCCT GCCAATGCCG ATACGGATGT GTTTATTAAC 
601 ATCGACGTAT TCGGAACGAT ACGCAACAGA ACCGAAATGC ACCTATACAA 
651 TGCCGAAACA CTGAAAGCCC AAACAAAACT GGAATATTTC GCAGTAGACA 
701 GAACCAATAA AAAATTGCTC ATCAAACCAA AAACCAATGC GTTTGAAGCT 
751 GCCTATAAAG AAAATTACGC ATTGTGGATG GGACCGTATA AAGTAAGCAA 
801 AGGAATTAAA CCGACAGAAG GATTAATGGT CGATTTCTCC GATATCCAAC 
851 CATACGGCAA TCATATGGGT AACTCTGCCC CATCCGTAGA GGCTGATAAC 
901 AGTCATGAGG GGTATGGATA CAGCGATGAA GCAGTGCGAC GACATAGACA 
951 AGGGCAACCT TGA 

This encodes a protein having amino acid sequence [<SEQ ID 82; ORF15a>] (SEQ ID NO: 82; ORF15a):



1 MQARLLIPIL FSVFILSACG TLTGIPSHGG GKRFAVEQEL VAASARAAVK

51 DMDLQALHGR KVALYIATMG DQGSGSLTGG RYSIDALIRG EYINSPAVRT 

101 DYTYPRYETT AETTSGGLTG LTTSLSTLNA PALSRTQSDG SGSKSSLGLN 

151 IGGMGDYRNE TLTTNPRDTA FLSHLVQTVF FLRGIDVVSP ANADTDVFIN

201 IDVFGTIRNR TEMHLYNAET LKAQTKLEYF AVDRTNKKLL IKPKTNAFEA 

251 AYKENYALWM GPYKVSKGIK PTEGLMVDFS DIQPYGNHMG NSAPSVEADN 
301 SHEGYGYSDE AVRRHRQGQP*

The originally-identified partial strain B sequence (ORF15) (SEQ ID NO: 78) shows 98.1% 
identity over a 213aa overlap with ORF15a (SEQ ID NO: 82):

                    10        20        30        40        50        60
orf15.pep   MQARLLIPILFSVFILSACGTLTGIPSHGGXKRFAVEQELVAASARAAVKDMDLQALHGR
            |||||||||||||||||||||||||||||| |||||||||||||||||||||||||||||
orf15a      MQARLLIPILFSVFILSACGTLTGIPSHGGGKRFAVEQELVAASARAAVKDMDLQALHGR
                    10        20        30        40        50        60

                    70        80        90       100       110       120
orf15.pep   KVALYIATMGDQGSGSLTGGRYSIDAXXXGEYINSPAVRTDYTYPRYETTAETTSGGLTG
            ||||||||||||||||||||||||||   |||||||||||||||||||||||||||||||
orf15a      KVALYIATMGDQGSGSLTGGRYSIDALIRGEYINSPAVRTDYTYPRYETTAETTSGGLTG
                    70        80        90       100       110       120

                   130       140       150       160       170       180
orf15.pep   LTTSLSTLNAPALSRTQSDGSGSKSSLGLNIGGMGDYRNETLTTNPRDTAFLSHLVQTVF
            ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
orf15a      LTTSLSTLNAPALSRTQSDGSGSKSSLGLNIGGMGDYRNETLTTNPRDTAFLSHLVQTVF
                   130       140       150       160       170       180

                   190       200       210
orf15.pep   FLRGIDVVSPANADTDVFINIDVFGTIRNRTEM
            |||||||||||||||||||||||||||||||||
orf15a      FLRGIDVVSPANADTDVFINIDVFGTIRNRTEMHLYNAETLKAQTKLEYFAVDRTNKKLL
                   190       200       210       220       230       240

The complete strain B sequence (ORF15-1) (SEQ ID NO: 80) and ORF15a (SEQ ID NO: 82) show 
98.8% identity in 320 aa overlap:

                    10        20        30        40        50        60
orf15a.pep  MQARLLIPILFSVFILSACGTLTGIPSHGGGKRFAVEQELVAASARAAVKDMDLQALHGR
            ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
orf15-1     MQARLLIPILFSVFILSACGTLTGIPSHGGGKRFAVEQELVAASARAAVKDMDLQALHGR
                    10        20        30        40        50        60

                    70        80        90       100       110       120
orf15a.pep  KVALYIATMGDQGSGSLTGGRYSIDALIRGEYINSPAVRTDYTYPRYETTAETTSGGLTG
            ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
orf15-1     KVALYIATMGDQGSGSLTGGRYSIDALIRGEYINSPAVRTDYTYPRYETTAETTSGGLTG
                    70        80        90       100       110       120

                   130       140       150       160       170       180
orf15a.pep  LTTSLSTLNAPALSRTQSDGSGSKSSLGLNIGGMGDYRNETLTTNPRDTAFLSHLVQTVF
            ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
orf15-1     LTTSLSTLNAPALSRTQSDGSGSKSSLGLNIGGMGDYRNETLTTNPRDTAFLSHLVQTVF
                   130       140       150       160       170       180

                   190       200       210       220       230       240
orf15a.pep  FLRGIDVVSPANADTDVFINIDVFGTIRNRTEMHLYNAETLKAQTKLEYFAVDRTNKKLL
            ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
orf15-1     FLRGIDVVSPANADTDVFINIDVFGTIRNRTEMHLYNAETLKAQTKLEYFAVDRTNKKLL
                   190       200       210       220       230       240

                   250       260       270       280       290       300
orf15a.pep  IKPKTNAFEAAYKENYALWMGPYKVSKGIKPTEGLMVDFSDIQPYGNHMGNSAPSVEADN
            |||||||||||||||||||||||||||||||||||||||||| ||||| |||||||||||
orf15-1     IKPKTNAFEAAYKENYALWMGPYKVSKGIKPTEGLMVDFSDIRPYGNHTGNSAPSVEADN
                   250       260       270       280       290       300

                   310       320
orf15a.pep  SHEGYGYSDEAVRRHRQGQPX
            |||||||||| || |||||||
orf15-1     SHEGYGYSDEVVRQHRQGQPX
                   310       320

Further work identified the corresponding gene in N. gonorrhoeae [<SEQ ID 83>] (SEQ ID NO: 83):



1 ATGCGGGCAC GGCTGCTGAT ACCTATTCTT TTTTCAGTTT TTATTTTATC 

51 CGCCTGCGGG ACACTGACAG GTATTCCATC GCATGGCGGA GGCAAACGCT 

101 TCGCGGTCGA ACAAGAACTT GTGGCCGCTT CTGCCAGAGC TGCCGTTAAA 

151 GACATGGATT TACAGGCATT ACACGGACGA AAAGTTGCAT TGTACATTGC 

201 AACTATGGGC GACCAAGGTT CAGGCAGTTT GACAGGGGGT CGCTACTCCA 

251 TTGATGCACT GATTCGCGGC GAATACATAA ACAGCCCTGC CGTCCGCACC 

301 GATTACACCT ATCCGCGTTA CGAAACCACC GCTGAAACAA CATCAGGCGG 

351 TTTGACGGGT TTAACCACTT CTTTATCTAC ACTTAATGCC CCTGCACTCT 

401 CGCGCACCCA ATCAGACGGT AGCGGAAGTA GGAGCAGTCT GGGCTTAAAT

451 ATTGGCGGGA TGGGGGATTA TCGAAATGAA ACCTTGACGA CCAACCCGCG 

501 CGACACTGCC TTTCTTTCCC ACTTGGTGCA GACCGTATTT TTCCTGCGCG 

551 GCATAGACGT TGTTTCTCCT GCCAATGCCG ATACAGATGT GTTTATTAAC 

601 ATCGACGTAT TCGGAACGAT ACGCAACAGA ACCGAAATGC ACCTATACAA 

651 TGCCGAAACA CTGAAAGCCC AAACAAAACT GGAATATTTC GCAGTAGACA 

701 GAACCAATAA AAAATTGCTC ATCAAACCCA AAACCAATGC GTTTGAAGCT 

751 GCCTATAAAG AAAATTACGC ATTGTGGATG GGGCCGTATA AAGTAAGCAA 

801 AGGAATCAAA CCGACGGAAG GATTGATGGT CGATTTCTCC GATATCCAAC 

851 CATACGGCAA TCATACGGGT AACTCCGCCC CATCCGTAGA GGCTGATAAC 

901 AGTCATGAGG GGTATGGATA CAGCGATGAA GCAGTGCGAC AACATAGACA 

951 AGGGCAACCT TGA 



This encodes a protein having amino acid sequence [<SEQ ID 84; ORF15ng>] (SEQ ID NO: 84; ORF15ng):



1 MRARLLIPIL FSVFILSACG TLTGIPSHGG GKRFAVEQEL VAASARAAVK

51 DMDLQALHGR KVALYIATMG DQGSGSLTGG RYSIDALIRG EYINSPAVRT 

101 DYTYPRYETT AETTSGGLTG LTTSLSTLNA PALSRTQSDG SGSRSSLGLN 

151 IGGMGDYRNE TLTTNPRDTA FLSHLVQTVF FLRGIDVVSP ANADTDVFIN

201 IDVFGTIRNR TEMHLYNAET LKAQTKLEYF AVDRTNKKLL IKPKTNAFEA 

251 AYKENYALWM GPYKVSKGIK PTEGLMVDFS DIQPYGNHTG NSAPSVEADN 

301 SHEGYGYSDE AVRQHRQGQP * 

The originally-identified partial strain B sequence (ORF15) (SEQ ID NO: 78) shows 97.2% 
identity over a 213aa overlap with ORF15ng (SEQ ID NO: 84):



orf15.pep   MQARLLIPILFSVFILSACGTLTGIPSHGGXKRFAVEQELVAASARAAVKDMDLQALHGR 60
            | |||||||||||||||||||||||||||| |||||||||||||||||||||||||||||
orf15ng     MRARLLIPILFSVFILSACGTLTGIPSHGGGKRFAVEQELVAASARAAVKDMDLQALHGR 60

orf15.pep   KVALYIATMGDQGSGSLTGGRYSIDAXXXGEYINSPAVRTDYTYPRYETTAETTSGGLTG 120
            ||||||||||||||||||||||||||   |||||||||||||||||||||||||||||||
orf15ng     KVALYIATMGDQGSGSLTGGRYSIDALIRGEYINSPAVRTDYTYPRYETTAETTSGGLTG 120

orf15.pep   LTTSLSTLNAPALSRTQSDGSGSKSSLGLNIGGMGDYRNETLTTNPRDTAFLSHLVQTVF 180
            ||||||||||||||||||||||| ||||||||||||||||||||||||||||||||||||
orf15ng     LTTSLSTLNAPALSRTQSDGSGSRSSLGLNIGGMGDYRNETLTTNPRDTAFLSHLVQTVF 180

orf15.pep   FLRGIDVVSPANADTDVFINIDVFGTIRNRTEM 213
            |||||||||||||||||||||||||||||||||
orf15ng     FLRGIDVVSPANADTDVFINIDVFGTIRNRTEMHLYNAETLKAQTKLEYFAVDRTNKKLL 240

The complete strain B sequence (ORF15-1) (SEQ ID NO: 80) and ORF15ng (SEQ ID NO: 84) 
show 98.8% identity in 320 aa overlap:



                    10        20        30        40        50        60
orf15-1.pep MQARLLIPILFSVFILSACGTLTGIPSHGGGKRFAVEQELVAASARAAVKDMDLQALHGR
            | ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
orf15ng     MRARLLIPILFSVFILSACGTLTGIPSHGGGKRFAVEQELVAASARAAVKDMDLQALHGR
                    10        20        30        40        50        60

                    70        80        90       100       110       120
orf15-1.pep KVALYIATMGDQGSGSLTGGRYSIDALIRGEYINSPAVRTDYTYPRYETTAETTSGGLTG
            ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
orf15ng     KVALYIATMGDQGSGSLTGGRYSIDALIRGEYINSPAVRTDYTYPRYETTAETTSGGLTG
                    70        80        90       100       110       120

                   130       140       150       160       170       180
orf15-1.pep LTTSLSTLNAPALSRTQSDGSGSKSSLGLNIGGMGDYRNETLTTNPRDTAFLSHLVQTVF
            ||||||||||||||||||||||| ||||||||||||||||||||||||||||||||||||
orf15ng     LTTSLSTLNAPALSRTQSDGSGSRSSLGLNIGGMGDYRNETLTTNPRDTAFLSHLVQTVF
                   130       140       150       160       170       180

                   190       200       210       220       230       240
orf15-1.pep FLRGIDVVSPANADTDVFINIDVFGTIRNRTEMHLYNAETLKAQTKLEYFAVDRTNKKLL
            ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
orf15ng     FLRGIDVVSPANADTDVFINIDVFGTIRNRTEMHLYNAETLKAQTKLEYFAVDRTNKKLL
                   190       200       210       220       230       240

                   250       260       270       280       290       300
orf15-1.pep IKPKTNAFEAAYKENYALWMGPYKVSKGIKPTEGLMVDFSDIRPYGNHTGNSAPSVEADN
            |||||||||||||||||||||||||||||||||||||||||| |||||||||||||||||
orf15ng     IKPKTNAFEAAYKENYALWMGPYKVSKGIKPTEGLMVDFSDIQPYGNHTGNSAPSVEADN
                   250       260       270       280       290       300

                   310       320
orf15-1.pep SHEGYGYSDEVVRQHRQGQPX
            |||||||||| ||||||||||
orf15ng     SHEGYGYSDEAVRQHRQGQPX
                   310       320

Computer analysis of these amino acid sequences reveals an ILSAC motif (putative membrane 
lipoprotein lipid attachment site, as predicted by the MOTIFS program). The analysis also 
indicates a putative leader sequence, and it was predicted that the proteins from N. meningitidis and 
N. gonorrhoeae, and their epitopes, could be useful antigens for vaccines or diagnostics, or for 
raising antibodies. 

ORF15-1 (SEQ ID NO: 80) (31.7kDa) was cloned in pET and pGex vectors and expressed in 
E. coli, as described above. The products of protein expression and purification were analyzed by 
SDS-PAGE. Figure 4A shows the results of affinity purification of the GST-fusion protein, and 
Figure 4B shows the results of expression of the His-fusion in E. coli. Purified GST-fusion protein 
was used to immunise mice, whose sera were used for Western blot (Figure 4C) and ELISA 
(positive result). These experiments confirm that ORFX-1 is a surface-exposed protein, and that it 
is a useful immunogen. 



Example 11

The following partial DNA sequence was identified in N. meningitidis [<SEQ ID 85>] (SEQ ID NO: 85):

1 ..GG.CAGCACA AAAAACAGGC GGTTgAACGG AAAAACCGTA TTTACGATGA

51 TGCCGGGTAT GATATTCGGC GTATTCACGG GCGCATTCTC CGCAAAATAT

101 ATCCCCGCGT TCGGGCTTCA AATTTTCTTC ATCCTGTTTT TAACCGCCGT

151 CGCATTCAAA ACACTGCATA CCGACCCTCA GACGGCATCC CGCCCGCTGC

201 CCGGACTGCC CrGACTGACT GCGGTTTCCA CACTGTTCGG CACAATGTCG

251 AGCTGGGTCG GCATAGGCGG CGGTTCACTT TCCGTCCCCT TCTTAATCCA

301 CTGCGGCTTC CCCGCCCATA AAGCCATCGG CACATCATCC GGCCTTGCCT

351 GGCCGATTGC ACTCTCCGGC GCAATATCGT ATCTGCTCAA CGGCCTGAAT

401 ATTGCAGGAT TGCCCGAAGG GTCACTGGGC TTCCTTTACC TGCCCGCCGT

451 CGCCGTCCTC AGCGCGGCAA CCATTGCCTT TGCCCCGCTC GGTGTCAAAA

501 CCGCCCACAA ACTTTCTTCT GCCAAACTCA AAAAATC.TT CGGCATTATG

551 TTGCTTTTGA TTGCCGGAAA AATGCTGTAC AACCTGCTTT AA

This corresponds to the amino acid sequence [<SEQ ID 86; ORF17>] (SEQ ID NO: 86; ORF17):

1 ..GQHKKQAVNG KTVFTMMPGM IFGVFTGAFS AKYIPAFGLQ IFFILFLTAV
51 AFKTLHTDPQ TASRPLPGLP XLTAVSTLFG TMSSWVGIGG GSLSVPFLIH
101 CGFPAHKAIG TSSGLAWPIA LSGAISYLLN GLNIAGLPEG SLGFLYLPAV
151 AVLSAATIAF APLGVKTAHK LSSAKLKKSF GIMLLLIAGK MLYNLL*

Further work revealed the complete nucleotide sequence [<SEQ ID 87>] (SEQ ID NO: 87):

1 ATGTGGCATT GGGACATTAT CTTAATCCTG CTTGCCGTAG GCAGTGCGGC

51 AGGTTTTATT GCCGGCCTGT TCGGCGTAGG CGGCGGCACG CTGATTGTCC

101 CTGTCGTTTT ATGGGTGCTT GATTTGCAGG GTTTGGCACA ACATCCTTAC

151 GCGCAACACC TCGCCGTCGG CACATCCTTC GCCGTCATGG TCTTCACCGC

201 CTTTTCCAGT ATGCTGGGGC AGCACAAAAA ACAGGCGGTC GACTGGAAAA

251 CCGTATTTAC GATGATGCCG GGTATGATAT TCGGCGTATT CACGGGCGCA

301 CTCTCCGCAA AATATATCCC CGCGTTCGGG CTTCAAATTT TCTTCATCCT

351 GTTTTTAACC GCCGTCGCAT TCAAAACACT GCATACCGAC CCTCAGACGG 

401 CATCCCGCCC GCTGCCCGGA CTGCCCGGAC TGACTGCGGT TTCCACACTG 

451 TTCGGCACAA TGTCGAGCTG GGTCGGCATA GGCGGCGGTT CACTTTCCGT 

501 CCCCTTCTTA ATCCACTGCG GCTTCCCCGC CCATAAAGCC ATCGGCACAT 

551 CATCCGGCCT TGCCTGGCCG ATTGCACTCT CCGGCGCAAT ATCGTATCTG 

601 CTCAACGGCC TGAATATTGC AGGATTGCCC GAAGGGTCAC TGGGCTTCCT 

651 TTACCTGCCC GCCGTCGCCG TCCTCAGCGC GGCAACCATT GCCTTTGCCC 

701 CGCTCGGTGT CAAAACCGCC CACAAACTTT CTTCTGCCAA ACTCAAAAAA 

751 Tc . TTCGGCA TTATGTTGCT TTTGATTGCC GGAAAAATGC TGTACAACCT 

801 GCTTTAA 

This corresponds to the amino acid sequence [<SEQ ID 88; ORF17-1>] (SEQ ID NO: 88; ORF17-1):



1 MWHWDIILIL LAVGSAAGFI AGLFGVGGGT LIVPVVLWVL DLQGLAQHPY

51 AQHLAVGTSF AVMVFTAFSS MLGQHKKQAV DWKTVFTMMP GMIFGVFTGA 

101 LSAKYIPAFG LQIFFILFLT AVAFKTLHTD PQTASRPLPG LPGLTAVSTL

151 FGTMSSWVGI GGGSLSVPFL IHCGFPAHKA IGTSSGLAWP IALSGAISYL 

201 LNGLNIAGLP EGSLGFLYLP AVAVLSAATI AFAPLGVKTA HKLSSAKLKK

251 XFGIMLLLIA GKMLYNLL* 

Computer analysis of this amino acid sequence gave the following results: 



Homology with hypothetical H. influenzae transmembrane protein HI0902 (accession number 
P44070) (SEQ ID NO: 1119)



ORF17 (SEQ ID NO: 86) and HI0902 proteins (SEQ ID NO: 1119) show 28% aa identity in 192 
aa overlap: 



ORF17 3 HKKQAVNGKTVFTMMPGMI FGVFT - GAFSAKY I PAFGLQI F - - F I LFLTAVAFKTLHTDP 59 

HK + + V + P ++ VF G F + +IF +++L + + D 

HI0902 72 HKXGNIVWQAVRILAPVIMLSVFICGLFIGRLDREISAKIFACLVVYLATKMVLSIKKD- 130 

ORF17 60 QTASRPLPGLPXLTAVSTLFGTMSSWVGIGGGSLSVPFLIHCGFPAHKAIGTSSGLAWPI 119 

Q ++ L L + L G SS GIGGG VPFL G +AIG+S+ + 

HI0902 131 QVTTKSLTPLSSVIG-GILIGMASSAAGIGGGGFIVPFLTARGINIKQAIGSSAFCGMLL 189 

ORF17 120 ALSGAISYLLNGLNIAGLPEGSLGFLYLPAVAVLSAATIAFAPLGVXXXXXXXXXXXXXX 179 

+SG S++++G +PE SLG++YLPAV ++A + + LG 

HI0902 190 GISGMFSFIVSGWGNPLMPEYSLGYIYLPAVLGITATSFFTSKLGASATAKLPVSTLKKG 249 

ORF17 180 FGIMLLLIAGKM 191 

F + L+ + +A M 
HI0902 250 FALFLIWAINM 261 

Homology with a predicted ORF from N. meningitidis (strain A)



ORF17 (SEQ ID NO: 86) shows 96.9% identity over a 196aa overlap with an ORF (ORF17a) 
(SEQ ID NO: 90) from strain A of N. meningitidis:

                                                  10        20        30
orf17.pep                                 GQHKKQAVNGKTVFTMMPGMIFGVFTGAFS
                                          ||||||||  |||||||||| |||| || |
orf17a      QGLAQHPYAQHLAVGTSFAVMVFTAFSSMLGQHKKQAVDWKTVFTMMPGMVFGVFAGALS
                  50        60        70        80        90       100

                    40        50        60        70        80        90
orf17.pep   AKYIPAFGLQIFFILFLTAVAFKTLHTDPQTASRPLPGLPXLTAVSTLFGTMSSWVGIGG
            |||||||||||||||||||||||||||||||||||||||| |||||||||||||||||||
orf17a      AKYIPAFGLQIFFILFLTAVAFKTLHTDPQTASRPLPGLPGLTAVSTLFGTMSSWVGIGG
                 110       120       130       140       150       160

                   100       110       120       130       140       150
orf17.pep   GSLSVPFLIHCGFPAHKAIGTSSGLAWPIALSGAISYLLNGLNIAGLPEGSLGFLYLPAV
            ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
orf17a      GSLSVPFLIHCGFPAHKAIGTSSGLAWPIALSGAISYLLNGLNIAGLPEGSLGFLYLPAV
                 170       180       190       200       210       220

                   160       170       180       190
orf17.pep   AVLSAATIAFAPLGVKTAHKLSSAKLKKSFGIMLLLIAGKMLYNLLX
            |||||||||||||||||||||||||||||||||||||||||||||||
orf17a      AVLSAATIAFAPLGVKTAHKLSSAKLKKSFGIMLLLIAGKMLYNLLX
                 230       240       250       260

The complete length ORF17a nucleotide sequence [<SEQ ID 89>] (SEQ ID NO: 89) is:

1 ATGTGGCATT GGGACATTAT CTTAATCCTG CTTGCCGTAG GCAGTGCGGC 

51 AGGTTTTATT GCCGGCCTGT TCGGCGTAGG CGGCGGCACG CTGATTGTCC 

101 CTGTCGTTTT ATGGGTGCTT GATTTGCAGG GTTTGGCACA ACATCCTTAC

151 GCGCAACACC TCGCCGTCGG CACATCCTTC GCCGTCATGG TCTTCACCGC 

201 CTTTTCCAGT ATGCTGGGGC AGCACAAAAA ACAGGCGGTC GACTGGAAAA 

251 CCGTATTTAC GATGATGCCG GGTATGGTAT TCGGCGTATT CGCTGGCGCA 

301 CTCTCCGCAA AATATATCCC AGCGTTCGGG CTTCAAATTT TCTTCATCCT
351 GTTTTTAACC GCCGTCGCAT TCAAAACACT GCATACCGAC CCTCAGACGG

401 CATCCCGCCC GCTGCCCGGA CTGCCCGGAC TGACTGCGGT TTCCACACTG
451 TTCGGCACAA TGTCGAGCTG GGTCGGCATA GGCGGCGGTT CACTTTCCGT
501 CCCCTTCTTA ATCCACTGCG GCTTCCCCGC CCATAAAGCC ATCGGCACAT 
551 CATCCGGCCT TGCCTGGCCG ATTGCACTCT CCGGCGCAAT ATCGTATCTG 

601 CTCAACGGCC TGAATATTGC AGGATTGCCC GAAGGGTCAC TGGGCTTCCT

651 TTACCTGCCC GCCGTCGCCG TCCTCAGCGC GGCAACCATT GCCTTTGCCC 

701 CGCTCGGTGT CAAAACCGCC CACAAACTTT CTTCTGCCAA ACTCAAAAAA 

751 TCCTTCGGCA TTATGTTGCT TTTGATTGCC GGAAAAATGC TGTACAACCT 

801 GCTTTAA 






This encodes a protein having amino acid sequence [<SEQ ID 90>] (SEQ ID NO: 90):



1 MWHWDIILIL LAVGSAAGFI AGLFGVGGGT LIVPVVLWVL DLQGLAQHPY

51 AQHLAVGTSF AVMVFTAFSS MLGQHKKQAV DWKTVFTMMP GMVFGVFAGA 

101 LSAKYIPAFG LQIFFILFLT AVAFKTLHTD PQTASRPLPG LPGLTAVSTL

151 FGTMSSWVGI GGGSLSVPFL IHCGFPAHKA IGTSSGLAWP IALSGAISYL 

201 LNGLNIAGLP EGSLGFLYLP AVAVLSAATI AFAPLGVKTA HKLSSAKLKK

251 SFGIMLLLIA GKMLYNLL* 



ORF17a (SEQ ID NO: 90) and ORF17-1 (SEQ ID NO: 88) show 98.9% identity in 268 aa overlap:

                    10        20        30        40        50        60
orf17a.pep  MWHWDIILILLAVGSAAGFIAGLFGVGGGTLIVPVVLWVLDLQGLAQHPYAQHLAVGTSF
            ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
orf17-1     MWHWDIILILLAVGSAAGFIAGLFGVGGGTLIVPVVLWVLDLQGLAQHPYAQHLAVGTSF
                    10        20        30        40        50        60

                    70        80        90       100       110       120
orf17a.pep  AVMVFTAFSSMLGQHKKQAVDWKTVFTMMPGMVFGVFAGALSAKYIPAFGLQIFFILFLT
            |||||||||||||||||||||||||||||||| |||| ||||||||||||||||||||||
orf17-1     AVMVFTAFSSMLGQHKKQAVDWKTVFTMMPGMIFGVFTGALSAKYIPAFGLQIFFILFLT
                    70        80        90       100       110       120

                   130       140       150       160       170       180
orf17a.pep  AVAFKTLHTDPQTASRPLPGLPGLTAVSTLFGTMSSWVGIGGGSLSVPFLIHCGFPAHKA
            ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
orf17-1     AVAFKTLHTDPQTASRPLPGLPGLTAVSTLFGTMSSWVGIGGGSLSVPFLIHCGFPAHKA
                   130       140       150       160       170       180

                   190       200       210       220       230       240
orf17a.pep  IGTSSGLAWPIALSGAISYLLNGLNIAGLPEGSLGFLYLPAVAVLSAATIAFAPLGVKTA
            ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
orf17-1     IGTSSGLAWPIALSGAISYLLNGLNIAGLPEGSLGFLYLPAVAVLSAATIAFAPLGVKTA
                   190       200       210       220       230       240

                   250       260      269
orf17a.pep  HKLSSAKLKKSFGIMLLLIAGKMLYNLLX
            |||||||||| ||||||||||||||||||
orf17-1     HKLSSAKLKKXFGIMLLLIAGKMLYNLLX
                   250       260

Homology with a predicted ORF from N. gonorrhoeae 

ORF17 (SEQ ID NO: 86) shows 93.9% identity over a 196aa overlap with a predicted ORF 
(ORF17.ng) (SEQ ID NO: 92) from N. gonorrhoeae:

orf17.pep                                 GQHKKQAVNGKTVFTMMPGMIFGVFTGAFS 30
                                          ||||||||: ||:|:||||||||||:||:|
orf17ng     QGLAQHPYAQHLAVGTSFAVMVFTAFSSMLGQHKKQAVDWKTIFAMMPGMIFGVFAGALS 102

orf17.pep   AKYIPAFGLQIFFILFLTAVAFKTLHTDPQTASRPLPGLPXLTAVSTLFGTMSSWVGIGG 90
            |||||||||||||||||||||||||||  ||||||||||| |||||||||:|||||||||
orf17ng     AKYIPAFGLQIFFILFLTAVAFKTLHTGRQTASRPLPGLPGLTAVSTLFGAMSSWVGIGG 162

orf17.pep   GSLSVPFLIHCGFPAHKAIGTSSGLAWPIALSGAISYLLNGLNIAGLPEGSLGFLYLPAV 150
            ||||||||||||||||||||||||||||||||||||||:|||||||||||||||||||||
orf17ng     GSLSVPFLIHCGFPAHKAIGTSSGLAWPIALSGAISYLVNGLNIAGLPEGSLGFLYLPAV 222

orf17.pep   AVLSAATIAFAPLGVKTAHKLSSAKLKKSFGIMLLLIAGKMLYNLL 196
            |||||||||||||||||||||||||||:||||||||||||||||||
orf17ng     AVLSAATIAFAPLGVKTAHKLSSAKLKESFGIMLLLIAGKMLYNLL 268
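
The identity figure quoted for this alignment can be checked directly from the aligned rows. The sketch below is a minimal illustration in Python: the helper name `percent_identity` is hypothetical, and it assumes an ungapped alignment in which both strings have already been trimmed to the overlap. It uses the final row of the alignment above, where the two sequences differ at a single position (K versus E).

```python
def percent_identity(a: str, b: str) -> float:
    """Percentage of aligned positions at which the two residues are identical.

    Assumes an ungapped alignment: both strings cover the same overlap.
    """
    if len(a) != len(b):
        raise ValueError("aligned sequences must have equal length")
    if not a:
        raise ValueError("aligned sequences must be non-empty")
    matches = sum(1 for x, y in zip(a, b) if x == y)
    return 100.0 * matches / len(a)


# Last row of the ORF17 / ORF17ng alignment (residues 151-196 of ORF17).
orf17_tail = "AVLSAATIAFAPLGVKTAHKLSSAKLKKSFGIMLLLIAGKMLYNLL"
orf17ng_tail = "AVLSAATIAFAPLGVKTAHKLSSAKLKESFGIMLLLIAGKMLYNLL"

print(f"{percent_identity(orf17_tail, orf17ng_tail):.1f}%")  # one mismatch in 46 residues
```

By the same arithmetic, 93.9% identity over a 196 aa overlap corresponds to 184 identical positions (184/196).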

An ORF17ng nucleotide sequence [<SEQ ID 91>] (SEQ ID NO: 91) is predicted to encode a
protein having amino acid sequence [<SEQ ID 92>] (SEQ ID NO: 92):

1 MWHWDIILIL LAVGSAAGFI AGLFGVGGGT LIVPVVLWVL DLQGLAQHPY

51 AQHLAVGTSF AVMVFTAFSS MLGQHKKQAV DWKTIFAMMP GMIFGVFAGA 

101 LSAKYIPAFG LQIFFILFLT AVAFKTLHTG RQTASRPLPG LPGLTAVSTL 

151 FGAMSSWVGI GGGSLSVPFL IHCGFPAHKA IGTSSGLAWP IALSGAISYL 

201 VNGLNIAGLP EGSLGFLYLP AVAVLSAATI AFAPLGVKTA HKLSSAKLKE 

251 SFGIMLLLIA GKMLYNLL* 

Further work revealed the complete gonococcal DNA sequence [<SEQ ID 93>] (SEQ ID NO: 93):



1 ATGTGGCATT GGGACATTAT CTTAATCCTG CTTGCcgtag gcAGTGCGGC 

51 AGGTTTTATT GCCGGCCTGT Tcggtgtagg cggcgGTACG CTGATTGTCC 

101 CTGTCGTTTT ATGGGTGCTT GATTTGCAGG GTTTGGCACA ACATCCTTAC 

151 GCGCAACACC TCGCCGTCGG CAcaTccttc gcCGTCATGG TCTTCACCGC 

201 CTTTTCCAGT ATGTTGGGGC AGCACAAAAA ACAGGCGGTC GACTGGAAAA 

251 CCATATTTGC GATGATGCCG GGTATGATAT TCGGCGTATT CGCTGGCGCA 

301 CTCTCCGCAA AATATATCCC CGCGTTCGGG CTTCAAATTT TCTTCATCCT 

351 GTTTTTAACC GCCGTCGCAT TCAAAACACT GCATACCGGT CGTCAGACGG 

401 CATCCCGCCC GCTGCCCGGG CTGCCCGGAC TGACTGCGGT TTCCACACTG

451 TTCGGCGCAA TGTCGAGCTG GGTCGGCATA GGCGGCGGTT CACTTTCCGT 

501 CCCCTTCTTA ATCCACTGCG GCTTCCCCGC CCATAAAGCC ATCGGCACAT 

551 CATCCGGCCT TGCCTGGCCG ATTGCACTCT CCGGCGCAAT ATCGTATCTG 

601 GTCAACGGTC TGAATATTGC AGGATTGCCC GAAGGGTCGC TGGGCTTCCT 

651 TTACCTGCCC GCCGTCGCCG TCCTCAGCGC GGCAACCATT GCCTTTGCCC 

701 CGCTCGGTGT CAAAACCGCC CACAAACTTT CTTCTGCCAA ACTCAAAGAA 

751 TCCTTCGGCA TTATGTTGCT TTTGATTGCC GGAAAAATGC TGTACAACCT 

801 GCTTTAA 

This corresponds to the amino acid sequence [<SEQ ID 94; ORF17ng-1>] (SEQ ID NO: 94;
ORF17ng-1):



1 MWHWDIILIL LAVGSAAGFI AGLFGVGGGT LIVPVVLWVL DLQGLAQHPY

51 AQHLAVGTSF AVMVFTAFSS MLGQHKKQAV DWKTIFAMMP GMIFGVFAGA

101 LSAKYIPAFG LQIFFILFLT AVAFKTLHTG RQTASRPLPG LPGLTAVSTL

151 FGAMSSWVGI GGGSLSVPFL IHCGFPAHKA IGTSSGLAWP IALSGAISYL

201 VNGLNIAGLP EGSLGFLYLP AVAVLSAATI AFAPLGVKTA HKLSSAKLKE

251 SFGIMLLLIA GKMLYNLL* 
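
The correspondence between a nucleotide listing and its amino acid sequence can be reproduced with the standard genetic code. The following Python sketch is illustrative only: the function name `translate` and the convention of reporting codons that contain an unresolved base as `X` are assumptions made here, not a description of the software actually used. It translates the first 30 and the last 9 nucleotides of the gonococcal sequence above.

```python
from itertools import product

BASES = "TCAG"
# Standard genetic code, codons ordered TTT, TTC, TTA, TTG, TCT, ... GGG.
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(dna: str) -> str:
    """Translate a DNA string in frame 1; unresolved codons become 'X'."""
    dna = dna.upper().replace(" ", "")
    protein = []
    # Ignore any trailing bases that do not form a complete codon.
    for i in range(0, len(dna) - len(dna) % 3, 3):
        protein.append(CODON_TABLE.get(dna[i:i + 3], "X"))
    return "".join(protein)


print(translate("ATGTGGCATT GGGACATTAT CTTAATCCTG"))  # start of the listing
print(translate("CTGCTTTAA"))                          # final codons, ending in a stop
```

Lower-case bases in the listings are upper-cased before lookup, so they translate like any other base.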

ORF17ng-1 (SEQ ID NO: 94) and ORF17-1 (SEQ ID NO: 88) show 96.6% identity in 268 aa
overlap:

                    10        20        30        40        50        60
orf17-1.pep MWHWDIILILLAVGSAAGFIAGLFGVGGGTLIVPVVLWVLDLQGLAQHPYAQHLAVGTSF
            ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
orf17ng-1   MWHWDIILILLAVGSAAGFIAGLFGVGGGTLIVPVVLWVLDLQGLAQHPYAQHLAVGTSF
                    10        20        30        40        50        60

                    70        80        90       100       110       120
orf17-1.pep AVMVFTAFSSMLGQHKKQAVDWKTVFTMMPGMIFGVFTGALSAKYIPAFGLQIFFILFLT
            ||||||||||||||||||||||||:|:||||||||||:||||||||||||||||||||||
orf17ng-1   AVMVFTAFSSMLGQHKKQAVDWKTIFAMMPGMIFGVFAGALSAKYIPAFGLQIFFILFLT
                    70        80        90       100       110       120

                   130       140       150       160       170       180
orf17-1.pep AVAFKTLHTDPQTASRPLPGLPGLTAVSTLFGTMSSWVGIGGGSLSVPFLIHCGFPAHKA
            |||||||||  |||||||||||||||||||||:|||||||||||||||||||||||||||
orf17ng-1   AVAFKTLHTGRQTASRPLPGLPGLTAVSTLFGAMSSWVGIGGGSLSVPFLIHCGFPAHKA
                   130       140       150       160       170       180

                   190       200       210       220       230       240
orf17-1.pep IGTSSGLAWPIALSGAISYLLNGLNIAGLPEGSLGFLYLPAVAVLSAATIAFAPLGVKTA
            ||||||||||||||||||||:|||||||||||||||||||||||||||||||||||||||
orf17ng-1   IGTSSGLAWPIALSGAISYLVNGLNIAGLPEGSLGFLYLPAVAVLSAATIAFAPLGVKTA
                   190       200       210       220       230       240

                   250       260      269
orf17-1.pep HKLSSAKLKKXFGIMLLLIAGKMLYNLLX
            |||||||||: ||||||||||||||||||
orf17ng-1   HKLSSAKLKESFGIMLLLIAGKMLYNLLX
                   250       260

In addition, ORF17ng-1 (SEQ ID NO: 94) shows significant homology with a hypothetical
H. influenzae protein (SEQ ID NO: 1119):

sp|P44070|Y902_HAEIN HYPOTHETICAL PROTEIN HI0902 pir||G64015 hypothetical protein
HI0902 - Haemophilus influenzae (strain Rd KW20) gi|1573922 (U32772) H. influenzae
predicted coding region HI0902 [Haemophilus influenzae] Length = 264
Score = 74 (34.9 bits), Expect = 1.6e-23, Sum P(2) = 1.6e-23
Identities = 15/43 (34%), Positives = 23/43 (53%)

Query: 55  AVGTSFAVMVFTAFSSMLGQHKKQAVDWKTIFAMMPGMIFGVF 97
           A+GTSFA +V T   S    HK   + W+ +  + P ++  VF
Sbjct: 52  ALGTSFATIVITGIGSAQRHHKLGNIVWQAVRILAPVIMLSVF 94

Score = 195 (91.9 bits), Expect = 1.6e-23, Sum P(2) = 1.6e-23
Identities = 44/114 (38%), Positives = 65/114 (57%)

Query: 150 LFGAMSSWVGIGGGSLSVPFLIHCGFPAHKAIGTSSGLAWPIALSGAISYLVNGLNIAGL 209
           L G  SS  GIGGG   VPFL   G    +AIG+S+     + +SG  S++V+G     +
Sbjct: 148 LIGMASSAAGIGGGGFIVPFLTARGINIKQAIGSSAFCGMLLGISGMFSFIVSGWGNPLM 207

Query: 210 PEGSLGFLYLPAVAVLSAATIAFAPLGVKTAHKLSSAKLKESFGIMLLLIAGKM 263
           PE SLG++YLPAV  ++A +   + LG     KL  + LK+ F + L+++A  M
Sbjct: 208 PEYSLGYIYLPAVLGITATSFFTSKLGASATAKLPVSTLKKGFALFLIVVAINM 261
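
The percentages in the BLAST report follow from the residue counts. A small Python sketch reproduces them and pools the two segments. The tuple layout and variable names are illustrative, and the use of integer truncation is inferred from the figures in the report (for example, 15/43 = 34.9% is shown as 34%).

```python
# (identities, positives, alignment length) for the two segments reported above
segments = [(15, 23, 43), (44, 65, 114)]

for identities, positives, length in segments:
    # Integer division truncates, matching the percentages in the report.
    print(f"Identities = {identities}/{length} ({100 * identities // length}%), "
          f"Positives = {positives}/{length} ({100 * positives // length}%)")

total_identities = sum(s[0] for s in segments)
total_positives = sum(s[1] for s in segments)
total_length = sum(s[2] for s in segments)
pooled_identity = 100 * total_identities / total_length
pooled_positive = 100 * total_positives / total_length
print(f"Pooled: {pooled_identity:.1f}% identical, {pooled_positive:.1f}% positive "
      f"over {total_length} aligned positions")
```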



This analysis, including the homology with the hypothetical H. influenzae transmembrane protein,
suggests that the proteins from N. meningitidis and N. gonorrhoeae, and their epitopes, could be
useful antigens for vaccines or diagnostics, or for raising antibodies.

Example 12

The following partial DNA sequence was identified in N. meningitidis [<SEQ ID 95>] (SEQ ID
NO: 95):



1 ..GGAAACGGAT GGCAGGCAGA CCCCGAACAT CCGCTGCTCG GGCTTTTTGC
51 CGTCAGTAAT GTATCGATGA CGCTTGCTTT TGTCGGAATA TGTGCGTTGG
101 TGCATTATTG CTTTTCGGGA ACGGTTCAAG TGTTTGTGTT TGCGGCACTG
151 CTCAAACTTT ATGCGCTGAA GCCGGTTTAT TGGTTCGTGT TGCAGTTTGT
201 GCTGATGGCG GTTGCCTATG TCCACCGCTG CGGTATAGAC CGGCAGCCGC
251 CGTCAACGTT CGGCGGCTCG CAGCTGCGAC TCGGCGGGTT GACGGCAGCG
301 TTGATGCAGG TCTCGGTACT GGTGCTGCTG CTTTCAGAAA TTGGAAGATA
351 A



This corresponds to the amino acid sequence [<SEQ ID 96; ORF18>] (SEQ ID NO: 96; ORF18):

1 ..GNGWQADPEH PLLGLFAVSN VSMTLAFVGI CALVHYCFSG TVQVFVFAAL
51 LKLYALKPVY WFVLQFVLMA VAYVHRCGID RQPPSTFGGS QLRLGGLTAA 
101 LMQVSVLVLL LSEIGR* 

Further work revealed the complete nucleotide sequence [<SEQ ID 97>] (SEQ ID NO: 97):

1 ATGATTTTGC TGCATTTGGA TTTTTTGTCT GCCTTACTGT ATGCGGCGGT 
51 TTTTCTGTTT CTGATATTCC GCGCAGGAAT GTTGCAATGG TTTTGGGCGA 
101 GTATTATGCT GTGGCTGGGC ATATCGGTTT TGGGGGCAAA GCTGATGCCC 
151 GGCATATGGG GAATGACCCG CGCCGCGCCC TTGTTCATCC CCCATTTTTA 
201 CCTGACTTTG GGCAGCATAT TTTTTTTCAT CGGGCATTGG AACCGGAAAA 
251 CAGATGGAAA CGGATGGCAG GCAGACCCCG AACATCCGCT GCTCGGGCTT 
301 TTTGCCGTCA GTAATGTATC GATGACGCTT GCTTTTGTCG GAATATGTGC
351 GTTGGTGCAT TATTGCTTTT CGGGAACGGT TCAAGTGTTT GTGTTTGCGG
401 CACTGCTCAA ACTTTATGCG CTGAAGCCGG TTTATTGGTT CGTGTTGCAG
451 TTTGTGCTGA TGGCGGTTGC CTATGTCCAC CGCTGCGGTA TAGACCGGCA
501 GCCGCCGTCA ACGTTCGGCG GCTCGCAGCT GCGACTCGGC GGGTTGACGG 
551 CAGCGTTGAT GCAGGTCTCG GTACTGGTGC TGCTGCTTTC AGAAATTGGA 
601 AGATAA 

This corresponds to the amino acid sequence [<SEQ ID 98; ORF18-1>] (SEQ ID NO: 98; ORF18-1):

1 MILLHLDFLS ALLYAAVFLF LIFRAGMLQW FWASIMLWLG ISVLGAKLMP
51 GIWGMTRAAP LFIPHFYLTL GSIFFFIGHW NRKTDGNGWQ ADPEHPLLGL
101 FAVSNVSMTL AFVGICALVH YCFSGTVQVF VFAALLKLYA LKPVYWFVLQ
151 FVLMAVAYVH RCGIDRQPPS TFGGSQLRLG GLTAALMQVS VLVLLLSEIG
201 R* 
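
The relationship between the partial sequence ORF18 and the complete sequence ORF18-1 can be checked with a simple substring search. In this Python sketch the variable names are illustrative, and the sequences are transcribed from the listings above with the terminal stop omitted. It reports where the partial sequence sits within the complete one.

```python
# ORF18-1 (SEQ ID 98), 201 residues, stop codon omitted
orf18_1 = (
    "MILLHLDFLSALLYAAVFLFLIFRAGMLQWFWASIMLWLGISVLGAKLMP"
    "GIWGMTRAAPLFIPHFYLTLGSIFFFIGHWNRKTDGNGWQADPEHPLLGL"
    "FAVSNVSMTLAFVGICALVHYCFSGTVQVFVFAALLKLYALKPVYWFVLQ"
    "FVLMAVAYVHRCGIDRQPPSTFGGSQLRLGGLTAALMQVSVLVLLLSEIG"
    "R"
)
# ORF18 (SEQ ID 96), the partial sequence identified first
orf18 = (
    "GNGWQADPEHPLLGLFAVSNVSMTLAFVGICALVHYCFSGTVQVFVFAAL"
    "LKLYALKPVYWFVLQFVLMAVAYVHRCGIDRQPPSTFGGSQLRLGGLTAA"
    "LMQVSVLVLLLSEIGR"
)

offset = orf18_1.find(orf18)  # 0-based index, -1 if absent
if offset < 0:
    print("partial sequence not found in the complete sequence")
else:
    start, end = offset + 1, offset + len(orf18)  # 1-based, inclusive
    print(f"ORF18 ({len(orf18)} aa) = residues {start}-{end} of ORF18-1")
```

The length of the partial sequence matches the 116 aa overlaps quoted for ORF18 in the comparisons with other strains.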

Computer analysis of this amino acid sequence gave the following results: 
Homology with a predicted ORF from N. meningitidis (strain A)

ORF18 (SEQ ID NO: 96) shows 98.3% identity over a 116 aa overlap with an ORF (ORF18a)
(SEQ ID NO: 100) from strain A of N. meningitidis:



                                                  10        20        30
orf18.pep                                 GNGWQADPEHPLLGLFAVSNVSMTLAFVGI
                                          ||||||||||||||||||||||||||||||
orf18a      TRAAPLFIPHFYLTLGSIFFFIGHWNRKTDGNGWQADPEHPLLGLFAVSNVSMTLAFVGI
               60        70        80        90       100       110

                    40        50        60        70        80        90
orf18.pep   CALVHYCFSGTVQVFVFAALLKLYALKPVYWFVLQFVLMAVAYVHRCGIDRQPPSTFGGS
            ||||||||| ||||||||||||||||||||||||||||||||||||||||||||||||||
orf18a      CALVHYCFSXTVQVFVFAALLKLYALKPVYWFVLQFVLMAVAYVHRCGIDRQPPSTFGGS
              120       130       140       150       160       170

                   100       110
orf18.pep   QLRLGGLTAALMQVSVLVLLLSEIGRX
            ||||||||||||| |||||||||||||
orf18a      QLRLGGLTAALMQXSVLVLLLSEIGRX
              180       190       200

The complete length ORF18a nucleotide sequence [<SEQ ID 99>] (SEQ ID NO: 99) is:



1 ATGATTTTGC TGCATTTGGA TTTTTTGTCT GCCTTACTGT ATGCGGCGGT
51 TTTTCTGTTT CTGATATTCC GCGCAGGAAT GTTGCAATGG TTTTGGGCGA
101 GTATTATGCT GTGGCTGGGC ATATCGGTTT TGGGGGCAAA GCTGATGCCC
151 GGCATATGGG GAATGACCCG CGCCGCGCCC TTGTTCATCC CCCATTTTTA
201 CCTGACTTTG GGCAGCATAT TTTTTTTCAT CGGGCATTGG AACCGGAAAA
251 CGGATGGAAA CGGATGGCAG GCAGACCCCG AACATCCTCT GCTCGGGCTG
301 TTTGCCGTCA GTAATGTATC GATGACGCTT GCTTTTGTCG GAATATGTGC
351 GTTGGTGCAT TATTGCTTTT CGNGAACGGT TCAAGTGTTT GTGTTTGCGG
401 CACTGCTCAA ACTTTATGCG CTGAAGCCGG TTTATTGGTT CGTGTTGCAG
451 TTTGTGCTGA TGGCGGTTGC CTATGTCCAC CGCTGCGGTA TAGACCGGCA
501 GCCGCCGTCA ACGTTCGGCG GNTCGCAGCT GCGACTCGGC GGGTTGACGG
551 CAGCGTTGAT GCAGNTCTCG GTACTGGTGC TGCTGCTTTC AGAAATTGGA
601 AGATAA



This encodes a protein having amino acid sequence [<SEQ ID 100>] (SEQ ID NO: 100):

1 MILLHLDFLS ALLYAAVFLF LIFRAGMLQW FWASIMLWLG ISVLGAKLMP
51 GIWGMTRAAP LFIPHFYLTL GSIFFFIGHW NRKTDGNGWQ ADPEHPLLGL
101 FAVSNVSMTL AFVGICALVH YCFSXTVQVF VFAALLKLYA LKPVYWFVLQ
151 FVLMAVAYVH RCGIDRQPPS TFGGSQLRLG GLTAALMQXS VLVLLLSEIG
201 R*

ORF18a (SEQ ID NO: 100) and ORF18-1 (SEQ ID NO: 98) show 99.0% identity in 201 aa
overlap:



                    10        20        30        40        50        60
orf18a.pep  MILLHLDFLSALLYAAVFLFLIFRAGMLQWFWASIMLWLGISVLGAKLMPGIWGMTRAAP
            ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
orf18-1     MILLHLDFLSALLYAAVFLFLIFRAGMLQWFWASIMLWLGISVLGAKLMPGIWGMTRAAP
                    10        20        30        40        50        60

                    70        80        90       100       110       120
orf18a.pep  LFIPHFYLTLGSIFFFIGHWNRKTDGNGWQADPEHPLLGLFAVSNVSMTLAFVGICALVH
            ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
orf18-1     LFIPHFYLTLGSIFFFIGHWNRKTDGNGWQADPEHPLLGLFAVSNVSMTLAFVGICALVH
                    70        80        90       100       110       120

                   130       140       150       160       170       180
orf18a.pep  YCFSXTVQVFVFAALLKLYALKPVYWFVLQFVLMAVAYVHRCGIDRQPPSTFGGSQLRLG
            |||| |||||||||||||||||||||||||||||||||||||||||||||||||||||||
orf18-1     YCFSGTVQVFVFAALLKLYALKPVYWFVLQFVLMAVAYVHRCGIDRQPPSTFGGSQLRLG
                   130       140       150       160       170       180

                   190       200
orf18a.pep  GLTAALMQXSVLVLLLSEIGRX
            |||||||| |||||||||||||
orf18-1     GLTAALMQVSVLVLLLSEIGRX
                   190       200

Homology with a predicted ORF from N. gonorrhoeae

ORF18 (SEQ ID NO: 96) shows 93.1% identity over a 116 aa overlap with a predicted ORF
(ORF18.ng) (SEQ ID NO: 102) from N. gonorrhoeae:





orf18.pep                                 GNGWQADPEHPLLGLFAVSNVSMTLAFVGI 30
                                          ||||||||||||||||||||||||||||||
orf18ng     TRAAPLFIPHFYLTLGSIFFFIGYWNRKTDGNGWQADPEHPLLGLFAVSNVSMTLAFVGI 115

orf18.pep   CALVHYCFSGTVQVFVFAALLKLYALKPVYWFVLQFVLMAVAYVHRCGIDRQPPSTFGGS 90
            ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
orf18ng     CALVHYCFSGTVQVFVFAALLKLYALKPVYWFVLQFVLMAVAYVHRCGIDRQPPSTFGGS 175

orf18.pep   QLRLGGLTAALMQVSVLVLLLSEIGR 116
            ||||| |:| ||||:| ::||:||||
orf18ng     QLRLGVLAAMLMQVAVTAMLLAEIGR 201





The complete length ORF18ng nucleotide sequence is [<SEQ ID 101>] (SEQ ID NO: 101):

1 ATGATTTTGC TGCATTTGGA TTTTTTGTCT GCCTTACTGt aTGCGGcggt

51 tttTctgTTT CTGATATTCC GCGCAGGAAT GTTGCAATGG TTTTGGGCGA 

101 GTATTGCGTT GTGGCTCGGC ATCTCGGTTT TAGGGGTAAA GCTGATGCCG 

151 GGGATGTGGG GAATGACCCG CGCCGCGCCT TTGTTCATCC CCCATTTTTA 

201 CCTGACTTTG GGCAGCATAT TTTTTTTCAT CGGGTATTGG AACCGGAAAA 

251 CAGATGGAAA CGGATGGCAG GCAGACCCCG AACATCCGCT GCTCGGGCTT 

301 TTTGCCGTCA GTAATGTATC GATGACGCTT GCTTTTGTCG GAATATGTGC 

351 GTTGGTGCAT TATTGCTTTT CGGGAACGGT TCAAGTGTTT GTGTTTGCGG 

401 CATTGCTCAA ACTTTATGCG CTGAAGCCGG TTTATTGGTT CGTGTTGCAG

451 TTTGTATTGA TGGCGGttgC CTATGTCCAC CGCTGCGGTA TAGACCGGCA

501 GCCGCCGTCA ACGTTCGGCG GTTCGCAGCT GCGACTCGGC GTGTTGGCGG 

551 CGATGTTGAT GCAGGTTGCG GTAACGGCGA TGCTGCTTGC CGAAATCGGC 

601 AGATGA 

This encodes a protein having amino acid sequence [<SEQ ID 102>] (SEQ ID NO: 102):

1 MILLHLDFLS ALLYAAVFLF LIFRAGMLQW FWASIALWLG ISVLGVKLMP
51 GMWGMTRAAP LFIPHFYLTL GSIFFFIGYW NRKTDGNGWQ ADPEHPLLGL
101 FAVSNVSMTL AFVGICALVH YCFSGTVQVF VFAALLKLYA LKPVYWFVLQ
151 FVLMAVAYVH RCGIDRQPPS TFGGSQLRLG VLAAMLMQVA VTAMLLAEIG
201 R* 

This ORF18ng (SEQ ID NO: 102) protein sequence shows 94.0% identity in 201 aa overlap with
ORF18-1 (SEQ ID NO: 98):



                    10        20        30        40        50        60
orf18-1.pep MILLHLDFLSALLYAAVFLFLIFRAGMLQWFWASIMLWLGISVLGAKLMPGIWGMTRAAP
            ||||||||||||||||||||||||||||||||||| |||||||||:|||||:||||||||
orf18ng     MILLHLDFLSALLYAAVFLFLIFRAGMLQWFWASIALWLGISVLGVKLMPGMWGMTRAAP
                    10        20        30        40        50        60

                    70        80        90       100       110       120
orf18-1.pep LFIPHFYLTLGSIFFFIGHWNRKTDGNGWQADPEHPLLGLFAVSNVSMTLAFVGICALVH
            ||||||||||||||||||:|||||||||||||||||||||||||||||||||||||||||
orf18ng     LFIPHFYLTLGSIFFFIGYWNRKTDGNGWQADPEHPLLGLFAVSNVSMTLAFVGICALVH
                    70        80        90       100       110       120

                   130       140       150       160       170       180
orf18-1.pep YCFSGTVQVFVFAALLKLYALKPVYWFVLQFVLMAVAYVHRCGIDRQPPSTFGGSQLRLG
            ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
orf18ng     YCFSGTVQVFVFAALLKLYALKPVYWFVLQFVLMAVAYVHRCGIDRQPPSTFGGSQLRLG
                   130       140       150       160       170       180

                   190       200
orf18-1.pep GLTAALMQVSVLVLLLSEIGRX
             |:| ||||:| ::||:|||||
orf18ng     VLAAMLMQVAVTAMLLAEIGRX
                   190       200

Based on this analysis, including the presence of several putative transmembrane domains in the
gonococcal protein, it is predicted that the proteins from N. meningitidis and N. gonorrhoeae, and
their epitopes, could be useful antigens for vaccines or diagnostics, or for raising antibodies.
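
Putative transmembrane domains of the kind mentioned above are commonly screened for with a sliding-window hydropathy profile. The text does not state which method was used. The sketch below assumes the Kyte-Doolittle scale with a 19-residue window and a cutoff of 1.6, which are conventional choices rather than the parameters of this analysis, and the function names are hypothetical.

```python
# Kyte-Doolittle hydropathy values (assumed scale; not stated in the text)
KD = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def hydropathy_profile(seq: str, window: int = 19) -> list[float]:
    """Mean hydropathy of every full window along the sequence."""
    values = [KD[aa] for aa in seq]
    return [
        sum(values[i:i + window]) / window
        for i in range(len(values) - window + 1)
    ]


def candidate_tm_windows(seq: str, window: int = 19, cutoff: float = 1.6) -> list[int]:
    """1-based start positions of windows whose mean hydropathy exceeds the cutoff."""
    profile = hydropathy_profile(seq, window)
    return [i + 1 for i, score in enumerate(profile) if score > cutoff]


# First 50 residues of the gonococcal ORF18ng protein listed above
orf18ng_start = "MILLHLDFLSALLYAAVFLFLIFRAGMLQWFWASIALWLGISVLGVKLMP"
print(candidate_tm_windows(orf18ng_start))
```

Runs of consecutive start positions in the output mark candidate membrane-spanning stretches; residues outside the 20 standard amino acids (such as X) would need to be handled before calling these helpers.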

Example 13 

The following partial DNA sequence was identified in N. meningitidis [<SEQ ID 103>] (SEQ ID
NO: 103):

1 ATGAAAACCC CACTCCTCAA GCCTCTGCTN ATTACCTCGC TTCCCGTTTT
51 CGCCAGTGTT TTTACCGCCG CCTCCATCGT CTGGCAGCTA GGCGAACCCA
101 AGCTCGCCAT GCCCTTCGTA CTCGGCATCA TCGCCGGCGG CCTTGTCGAT
151 TTGGACAACC NCNTGACCGG ACGGCTNAAA AACATCATCA CCACCGTCGC
201 CCTGTTCACC CTCTCCTCGC TCACGGCACA AAGCACCCTC GGCACAGGGC
251 TGCCCTTCAT CCTCGCCATG ACCCTGATGA CTT.CG.CTT CACCATTTTA
301 GGCGCGGNCG ...

This corresponds to the amino acid sequence [<SEQ ID 104; ORF19>] (SEQ ID NO: 104;
ORF19):

1 MKTPLLKPLL ITSLPVFASV FTAASIVWQL GEPKLAMPFV LGIIAGGLVD
51 LDNXXTGRLK NIITTVALFT LSSLTAQSTL GTGLPFILAM TLMTXXFTIL
101 GAX...

Further work revealed the complete nucleotide sequence [<SEQ ID 105>] (SEQ ID NO: 105):

1 ATGAAAACCC CACTCCTCAA GCCTCTGCTC ATTACCTCGC TTCCCGTTTT
51 CGCCAGTGTT TTTACCGCCG CCTCCATCGT CTGGCAGCTA GGCGAACCCA
101 AGCTCGCCAT GCCCTTCGTA CTCGGCATCA TCGCCGGCGG CCTTGTCGAT

151 TTGGACAACC GCCTGACCGG ACGGCTGAAA AACATCATCA CCACCGTCGC 

201 CCTGTTCACC CTCTCCTCGC TCACGGCACA AAGCACCCTC GGCACAGGGC 

251 TGCCCTTCAT CCTCGCCATG ACCCTGATGA CCTTCGGCTT CACCATTTTA 

301 GGCGCGGTCG GGCTCAAATA CCGCACCTTC GCCTTCGGTG CACTCGCCGT 

351 CGCCACCTAC ACCACACTTA CCTACACCCC CGAAACCTAC TGGCTGACCA 

401 ACCCCTTCAT GATTTTATGC GGCACCGTAC TGTACAGCAC CGCCATCCTC 

451 CTGTTCCAAA TCGTCCTGCC CCACCGCCCC GTCCAAGAAA GCGTCGCCAA

501 CGCCTACGAC GCACTCGGCG GCTACCTCGA AGCCAAAGCC GACTTCTTCG 

551 ACCCCGATGA GGCAGCCTGG ATAGGCAACC GCCACATCGA CCTCGCCATG 

601 AGCAACACCG GCGTCATCAC CGCCTTCAAC CAATGCCGTT CCGCCCTGTT 

651 TTACCGCCTT CGCGGCAAAC ACCGCCACCC GCGCACCGCC AAAATGCTGC 

701 GTTACTACTT TGCCGCCCAA GACATACACG AACGCATCAG CTCCGCCCAC 

751 GTCGATTATC AGGAAATGTC CGAAAAATTC AAAAACACCG ACATCATCTT 

801 CCGCATCCAC CGCCTGCTCG AAATGCAGGG ACAAGCCTGC CGCAACACCG 

851 CCCAAGCCCT GCGCGCAAGC AAAGACTACG TTTACAGCAA ACGCCTCGGC 

901 CGCGCCATCG AAGGCTGCCG CCAATCGCTG CGCCTCCTTT CAGACAGCAA 

951 CGACAGTCCC GACATCCGCC ACCTGCGCCG CCTTCTCGAC AACCTCGGCA 

1001 GCGTCGACCA GCAGTTCCGC CAACTCCAGC ACAACGGCCT GCAGGCAGAA 

1051 AACGACCGCA TGGGCGACAC CCGCATCGCC GCCCTCGAAA CCAGCAGCCT 

1101 CAAAAACACC TGGCAGGCAA TCCGTCCGCA GCTAAACCTC GAATCAGGCG 

1151 TATTCCGCCA TGCCGTCCGC CTGTCCCTCG TCGTTGCCGC CGCCTGCACC 

1201 ATCGTCGAAG CCCTCAACCT CAACCTCGGC TACTGGATAC TACTGACCGC 

1251 CCTTTTCGTC TGCCAACCCA ACTACACCGC CACCAAAAGC CGCGTCCGCC 

1301 AGCGCATCGC CGGCACCGTA CTCGGCGTAA TCGTCGGCTC GCTCGTCCCC 

1351 TACTTCACCC CGTCTGTCGA AACCAAACTC TGGATTGTCA TCGCCAGTAC 

1401 CACCCTCTTT TTCATGACCC GCACCTACAA ATACAGTTTC TCCACCTTCT

1451 TCATTACCAT TCAAGCCCTG ACCAGCCTCT CCCTCGCAGG TTTGGACGTA

1501 TACGCCGCCA TGCCCGTACG CATCATCGAC ACCATTATCG GCGCATCCCT 

1551 TGCCTGGGCG GCAGTCAGCT ACCTGTGGCC AGACTGGAAA TACCTCACGC 

1601 TCGAACGCAC CGCCGCCCTT GCCGTATGCA GCAACGGTGC CTATCTCGAA 

1651 AAAATCACCG AACGCCTCAA AAGCGGCGAA ACCGGCGACG ACGTCGAATA 

1701 CCGCGCCACC CGCCGCCGCG CCCACGAACA CACCGCCGCC CTCAGCAGCA 

1751 CCCTTTCCGA CATGAGCAGC GAACCCGCAA AATTCGCCGA CAGCCTGCAA 

1801 CCCGGCTTTA CCCTGCTCAA AACCGGCTAC GCCCTGACCG GCTACATCTC 

1851 CGCCCTCGGC GCATACCGCA GCGAAATGCA CGAAGAATGC AGCCCCGACT 

1901 TTACCGCACA GTTCCACCTC GCCGCCGAAC ACACCGCCCA CATCTTCCAA 

1951 CACCTGCCCG AAACCGAACC CGACGACTTT CAGACAGCAC TGGATACACT 

2001 GCGCGGCGAA CTCGACACCC TCCGCACCCA CAGCAGCGGA ACACAAAGCC

2051 ACATCCTCCT CCAACAGCTC CAACTCATCG CCCGACAGCT CGAACCCTAC 

2101 TACCGCGCCT ACCGCCAAAT TCCGCACAGG CAGCCCCAAA ATGCAGCCTG 

2151 A 

This corresponds to the amino acid sequence [<SEQ ID 106; ORF19-1>] (SEQ ID NO: 106;
ORF19-1):



1 MKTPLLKPLL ITSLPVFASV FTAASIVWQL GEPKLAMPFV LGIIAGGLVD
51 LDNRLTGRLK NIITTVALFT LSSLTAQSTL GTGLPFILAM TLMTFGFTIL
101 GAVGLKYRTF AFGALAVATY TTLTYTPETY WLTNPFMILC GTVLYSTAIL
151 LFQIVLPHRP VQESVANAYD ALGGYLEAKA DFFDPDEAAW IGNRHIDLAM
201 SNTGVITAFN QCRSALFYRL RGKHRHPRTA KMLRYYFAAQ DIHERISSAH
251 VDYQEMSEKF KNTDIIFRIH RLLEMQGQAC RNTAQALRAS KDYVYSKRLG
301 RAIEGCRQSL RLLSDSNDSP DIRHLRRLLD NLGSVDQQFR QLQHNGLQAE
351 NDRMGDTRIA ALETSSLKNT WQAIRPQLNL ESGVFRHAVR LSLVVAAACT
401 IVEALNLNLG YWILLTALFV CQPNYTATKS RVRQRIAGTV LGVIVGSLVP
451 YFTPSVETKL WIVIASTTLF FMTRTYKYSF STFFITIQAL TSLSLAGLDV
501 YAAMPVRIID TIIGASLAWA AVSYLWPDWK YLTLERTAAL AVCSNGAYLE
551 KITERLKSGE TGDDVEYRAT RRRAHEHTAA LSSTLSDMSS EPAKFADSLQ
601 PGFTLLKTGY ALTGYISALG AYRSEMHEEC SPDFTAQFHL AAEHTAHIFQ
651 HLPETEPDDF QTALDTLRGE LDTLRTHSSG TQSHILLQQL QLIARQLEPY
701 YRAYRQIPHR QPQNAA*

Computer analysis of this amino acid sequence gave the following results:

Homology with predicted transmembrane protein YHFK of H. influenzae (accession number
P44289) (SEQ ID NO: 1120)

ORF19 (SEQ ID NO: 104) and YHFK proteins (SEQ ID NO: 1120) show 45% aa identity in 97 aa
overlap:

orf19   6  LKPLLITSLPVFASVFTAASIVWQLGEPKLAMPFVLGIIAGGLVDLDNXXTGRLKNIITT 65
           L   +I+++PVF +V  AA  +W       +MP +LGIIAGGLVDLDN  TGRLKN+  T
YHFK    5  LNAKVISTIPVFIAVNIAAVGIWFFDISSQSMPLILGIIAGGLVDLDNRLTGRLKNVFFT 64

orf19  66  VALFTLSSLTAQSTLGTGLPFILAMTLMTXXFTILGA 102
           +  F++SS   Q  +G  + +I+ MT++T  FT++GA
YHFK   65  LIAFSISSFIVQLHIGKPIQYIVLMTVLTFIFTMIGA 101

Homology with a predicted ORF from N. meningitidis (strain A)

ORF19 (SEQ ID NO: 104) shows 92.2% identity over a 102 aa overlap with an ORF (ORF19a)
(SEQ ID NO: 108) from strain A of N. meningitidis:

                    10        20        30        40        50        60
orf19.pep   MKTPLLKPLLITSLPVFASVFTAASIVWQLGEPKLAMPFVLGIIAGGLVDLDNXXTGRLK
            |||| ||||||||||||||||||||||||||||||||||||||||||||||||  |||||
orf19a      MKTPPLKPLLITSLPVFASVFTAASIVWQLGEPKLAMPFVLGIIAGGLVDLDNRLTGRLK
                    10        20        30        40        50        60

                    70        80        90       100
orf19.pep   NIITTVALFTLSSLTAQSTLGTGLPFILAMTLMTXXFTILGAX
            |||:||||||||||:|||||||||||||||||||  |||:||
orf19a      NIIATVALFTLSSLVAQSTLGTGLPFILAMTLMTFGFTIMGAVGLKYRTFAFGALAVATY
                    70        80        90       100       110       120

orf19a      TTLTYTPETYWLTNPFMILCGTVLYSTAIILFQIILPHRPVQENVANAYEALGSYLEAKA
                   130       140       150       160       170       180



The complete length ORF19a nucleotide sequence [<SEQ ID 107>] (SEQ ID NO: 107) is:

1 ATGAAAACCC CACCCCTCAA GCCTCTGCTC ATTACCTCGC TTCCCGTTTT 

51 CGCCAGTGTC TTTACCGCCG CCTCCATCGT CTGGCAGCTG GGCGAACCCA 

101 AGCTCGCCAT GCCCTTCGTA CTCGGCATCA TCGCTGGCGG CCTGGTCGAT 

151 TTGGACAACC GCCTGACCGG ACGGCTGAAA AACATCATCG CCACCGTCGC 

201 CCTGTTCACC CTCTCCTCAC TTGTCGCGCA AAGCACCCTC GGCACAGGTT 

251 TGCCATTCAT CCTCGCCATG ACCCTGATGA CTTTCGGCTT TACCATCATG 

301 GGCGCGGTCG GGCTGAAATA CCGCACCTTC GCCTTCGGCG CACTCGCCGT 

351 CGCCACCTAC ACCACACTTA CCTACACCCC CGAAACCTAC TGGCTGACCA 

401 ACCCCTTTAT GATTCTGTGC GGAACCGTAC TGTACAGCAC CGCCATCATC 
451 CTGTTCCAAA TCATCCTGCC CCACCGCCCC GTTCAAGAAA ACGTCGCCAA

501 CGCCTACGAA GCACTCGGCA GCTACCTCGA AGCCAAAGCC GACTTTTTCG 

551 ATCCCGACGA AGCCGAATGG ATAGGCAACC GCCACATCGA CCTCGCCATG 

601 AGCAACACCG GCGTCATCAC CGCCTTCAAC CAATGCCGTT CCGCCCTGTT 

651 TTACCGCCTT CGCGGCAAAC ACCGCCACCC GCGCACCGCC AAAATGCTGC

701 GCTACTACTT CGCCGCCCAA GACATACACG AACGCATCAG CTCCGCCCAC 

751 GTCGACTACC AAGAGATGTC CGAAAAATTC AAAAACACCG ACATCATCTT 

801 CCGCATCCAC CGCCTGCTCG AAATGCAGGG ACAAGCCTGC CGCAACACCG 

851 CCCAAGCCCT GCGCGCAAGC AAAGACTACG TTTACAGCAA ACGCCTCGGC 

901 CGCGCCATCG AAGGCTGCCG CCAATCGCTG CGCCTCCTTT CAGACAGCAA

951 CGACAATCCC GACATCCGCC ACCTGCGCCG CCTTCTCGAC AACCTCGGCA 

1001 GCGTCGACCA GCAGTTCCGC CAACTCCAGC ACAACGGCCT GCAGGCAGAA 

1051 AACGACCGCA TGGGCGACAC CCGCATCGCC GCCCTCGAAA CCGGCAGCCT 

1101 CAAAAACACC TGGCAGGCAA TCCGTCCGCA GCTAAACCTC GAATCAGGCG 

1151 TATTCCGCCA TGCCGTCCGC CTGTCCCTTG TCGTTGCCGC CGCCTGCACC

1201 ATCGTCGAAG CCCTCAACCT CAACCTCGGC TACTGGATAC TACTGACCGC

1251 CCTTTTCGTC TGCCAACCCA ACTACACCGC CACCAAAAGC CGCGTCCGCC

1301 AGCGCATCGC CGGCACCGTA CTCGGCGTAA TCGTCGGCTC GCTCGTCCCC
1351 TACTTTACCC CCTCCGTCGA AACCAAACTC TGGATCGTCA TCGCCAGTAC

1401 CACCCTCTTT TTCATGACCC GCACCTACAA ATACAGCTTC TCGACATTTT

1451 TCATCACCAT TCAAGCCCTG ACCAGCCTCT CCCTCGCAGG GTTGGACGTA
1501 TACGCCGCCA TGCCCGTACG CATCATCGAC ACCATTATCG GCGCATCCCT 
1551 TGCCTGGGCG GCAGTCAGCT ACCTGTGGCC AGACTGGAAA TACCTCACGC 
1601 TCGAACGCAC CGCCGCCCTT GCCGTATGCA GCAACGGCGC CTATCTCGAA 

1651 AAAATCACCG AACGCCTCAA AAGCGGCGAA ACCGGCGACG ACGTCGAATA

1701 CCGCGCCACC CGCCGCCGCG CCCACGAACA CACCGCCGCC CTCAGCAGCA 

1751 CCCTTTCCGA CATGAGCAGC GAACCCGCAA AATTCGCCGA CAGCCTGCAA 

1801 CCCGGCTTTA CCCTGCTCAA AACCGGCTAC GCCCTGACCG GCTACATCTC 

1851 CGCCCTCGGC GCATACCGCA GCGAAATGCA CGAAGAATGC AGCCCCGACT 

1901 TTACCGCACA GTTCCACCTC GCCGCCGAAC ACACCGCCCA CATCTTCCAA

1951 CACCTGCCCG AAACCGAACC CGACGACTTT CAGACAGCAC TGGATACACT 

2001 GCGCGGCGAA CTCGACACCC TCCGCACCCA CAGCAGCGGA ACACAAAGCC 

2051 ACATCCTCCT CCAACAGCTC CAACTCATCG CCCGGCAGCT CGAACCCTAC

2101 TACCGCGCCT ACCGACAAAT TCCGCACAGG CAGCCCCAAA ACGCAGCCTG 

2151 A

This encodes a protein having amino acid sequence [<SEQ ID 108>] (SEQ ID NO: 108):

1 MKTPPLKPLL ITSLPVFASV FTAASIVWQL GEPKLAMPFV LGIIAGGLVD
51 LDNRLTGRLK NIIATVALFT LSSLVAQSTL GTGLPFILAM TLMTFGFTIM
101 GAVGLKYRTF AFGALAVATY TTLTYTPETY WLTNPFMILC GTVLYSTAII
151 LFQIILPHRP VQENVANAYE ALGSYLEAKA DFFDPDEAEW IGNRHIDLAM
201 SNTGVITAFN QCRSALFYRL RGKHRHPRTA KMLRYYFAAQ DIHERISSAH
251 VDYQEMSEKF KNTDIIFRIH RLLEMQGQAC RNTAQALRAS KDYVYSKRLG
301 RAIEGCRQSL RLLSDSNDNP DIRHLRRLLD NLGSVDQQFR QLQHNGLQAE
351 NDRMGDTRIA ALETGSLKNT WQAIRPQLNL ESGVFRHAVR LSLVVAAACT
401 IVEALNLNLG YWILLTALFV CQPNYTATKS RVRQRIAGTV LGVIVGSLVP
451 YFTPSVETKL WIVIASTTLF FMTRTYKYSF STFFITIQAL TSLSLAGLDV
501 YAAMPVRIID TIIGASLAWA AVSYLWPDWK YLTLERTAAL AVCSNGAYLE
551 KITERLKSGE TGDDVEYRAT RRRAHEHTAA LSSTLSDMSS EPAKFADSLQ
601 PGFTLLKTGY ALTGYISALG AYRSEMHEEC SPDFTAQFHL AAEHTAHIFQ
651 HLPETEPDDF QTALDTLRGE LDTLRTHSSG TQSHILLQQL QLIARQLEPY
701 YRAYRQIPHR QPQNAA*

ORF19a (SEQ ID NO: 108) and ORF19-1 (SEQ ID NO: 106) show 98.3% identity in 716 aa
overlap:

                    10        20        30        40        50        60
orf19a.pep  MKTPPLKPLLITSLPVFASVFTAASIVWQLGEPKLAMPFVLGIIAGGLVDLDNRLTGRLK
            |||| |||||||||||||||||||||||||||||||||||||||||||||||||||||||
orf19-1     MKTPLLKPLLITSLPVFASVFTAASIVWQLGEPKLAMPFVLGIIAGGLVDLDNRLTGRLK
                    10        20        30        40        50        60

                    70        80        90       100       110       120
orf19a.pep  NIIATVALFTLSSLVAQSTLGTGLPFILAMTLMTFGFTIMGAVGLKYRTFAFGALAVATY
            |||:||||||||||:||||||||||||||||||||||||:||||||||||||||||||||
orf19-1     NIITTVALFTLSSLTAQSTLGTGLPFILAMTLMTFGFTILGAVGLKYRTFAFGALAVATY
                    70        80        90       100       110       120

                   130       140       150       160       170       180
orf19a.pep  TTLTYTPETYWLTNPFMILCGTVLYSTAIILFQIILPHRPVQENVANAYEALGSYLEAKA
            |||||||||||||||||||||||||||||:||||:|||||||:||||||:|||:||||||
orf19-1     TTLTYTPETYWLTNPFMILCGTVLYSTAILLFQIVLPHRPVQESVANAYDALGGYLEAKA
                   130       140       150       160       170       180

                   190       200       210       220       230       240
orf19a.pep  DFFDPDEAEWIGNRHIDLAMSNTGVITAFNQCRSALFYRLRGKHRHPRTAKMLRYYFAAQ
            |||||||| |||||||||||||||||||||||||||||||||||||||||||||||||||
orf19-1     DFFDPDEAAWIGNRHIDLAMSNTGVITAFNQCRSALFYRLRGKHRHPRTAKMLRYYFAAQ
                   190       200       210       220       230       240

                   250       260       270       280       290       300
orf19a.pep  DIHERISSAHVDYQEMSEKFKNTDIIFRIHRLLEMQGQACRNTAQALRASKDYVYSKRLG
            ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
orf19-1     DIHERISSAHVDYQEMSEKFKNTDIIFRIHRLLEMQGQACRNTAQALRASKDYVYSKRLG
                   250       260       270       280       290       300

                   310       320       330       340       350       360
orf19a.pep  RAIEGCRQSLRLLSDSNDNPDIRHLRRLLDNLGSVDQQFRQLQHNGLQAENDRMGDTRIA
            ||||||||||||||||||:|||||||||||||||||||||||||||||||||||||||||
orf19-1     RAIEGCRQSLRLLSDSNDSPDIRHLRRLLDNLGSVDQQFRQLQHNGLQAENDRMGDTRIA
                   310       320       330       340       350       360

                   370       380       390       400       410       420
orf19a.pep  ALETGSLKNTWQAIRPQLNLESGVFRHAVRLSLVVAAACTIVEALNLNLGYWILLTALFV
            ||||:|||||||||||||||||||||||||||||||||||||||||||||||||||||||
orf19-1     ALETSSLKNTWQAIRPQLNLESGVFRHAVRLSLVVAAACTIVEALNLNLGYWILLTALFV
                   370       380       390       400       410       420

                   430       440       450       460       470       480
orf19a.pep  CQPNYTATKSRVRQRIAGTVLGVIVGSLVPYFTPSVETKLWIVIASTTLFFMTRTYKYSF
            ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
orf19-1     CQPNYTATKSRVRQRIAGTVLGVIVGSLVPYFTPSVETKLWIVIASTTLFFMTRTYKYSF
                   430       440       450       460       470       480

                   490       500       510       520       530       540
orf19a.pep  STFFITIQALTSLSLAGLDVYAAMPVRIIDTIIGASLAWAAVSYLWPDWKYLTLERTAAL
            ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
orf19-1     STFFITIQALTSLSLAGLDVYAAMPVRIIDTIIGASLAWAAVSYLWPDWKYLTLERTAAL
                   490       500       510       520       530       540

                   550       560       570       580       590       600
orf19a.pep  AVCSNGAYLEKITERLKSGETGDDVEYRATRRRAHEHTAALSSTLSDMSSEPAKFADSLQ
            ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
orf19-1     AVCSNGAYLEKITERLKSGETGDDVEYRATRRRAHEHTAALSSTLSDMSSEPAKFADSLQ
                   550       560       570       580       590       600

                   610       620       630       640       650       660
orf19a.pep  PGFTLLKTGYALTGYISALGAYRSEMHEECSPDFTAQFHLAAEHTAHIFQHLPETEPDDF
            ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
orf19-1     PGFTLLKTGYALTGYISALGAYRSEMHEECSPDFTAQFHLAAEHTAHIFQHLPETEPDDF
                   610       620       630       640       650       660

                   670       680       690       700       710
orf19a.pep  QTALDTLRGELDTLRTHSSGTQSHILLQQLQLIARQLEPYYRAYRQIPHRQPQNAAX
            |||||||||||||||||||||||||||||||||||||||||||||||||||||||||
orf19-1     QTALDTLRGELDTLRTHSSGTQSHILLQQLQLIARQLEPYYRAYRQIPHRQPQNAAX
                   670       680       690       700       710

Homology with a predicted ORF from N. gonorrhoeae 

ORF19 (SEQ ID NO: 104) shows 95.1% identity over a 102 aa overlap with a predicted ORF
(ORF19.ng) (SEQ ID NO: 110) from N. gonorrhoeae:

orf19.pep   MKTPLLKPLLITSLPVFASVFTAASIVWQLGEPKLAMPFVLGIIAGGLVDLDNXXTGRLK 60
            |||||||||||||||||||||||||||||||||||||||||||||||||||||  |||||
orf19ng     MKTPLLKPLLITSLPVFASVFTAASIVWQLGEPKLAMPFVLGIIAGGLVDLDNRLTGRLK 60

orf19.pep   NIITTVALFTLSSLTAQSTLGTGLPFILAMTLMTXXFTILGAX 103
            |||:||||||||||||||||||||||||||||||  ||||||
orf19ng     NIIATVALFTLSSLTAQSTLGTGLPFILAMTLMTFGFTILGAVGLKYRTFAFGALAVATY 120

An ORF19ng nucleotide sequence [<SEQ ID 109>] (SEQ ID NO: 109) is predicted to encode a
protein having amino acid sequence [<SEQ ID 110>] (SEQ ID NO: 110):

1 MKTPLLKPLL ITSLPVFASV FTAASIVWQL GEPKLAMPFV LGIIAGGLVD
51 LDNRLTGRLK NIIATVALFT LSSLTAQSTL GTGLPFILAM TLMTFGFTIL
101 GAVGLKYRTF AFGALAVATY TTLTYTPETY WLTNPFMILC GTVLYSTAII
151 LFQIILPHRP VQESVANAYE ALGGYLEAKA DFFDPDEAAW IGNRHIDLAM
201 SNTGVITAFN QCRSALFYRL RGKHRHPRTA KMLRYYFAAQ DIHERISSAH
251 VDYQEMSEKF KNTDIIFRIR RLLEMQGQAC RNTAQAIRSG KDYVYSKRLG
301 RAIEGCRQSL RLLSDGNDSP DIRHLSRLLD NLGSVDQQFR QLRHSDSPAE
351 NDRMGDTRIA ALETGSFKNT *

Further work revealed the complete nucleotide sequence [<SEQ ID 111>] (SEQ ID NO: 111):

1 ATGAAAACCC CACTCCTCAA GCCTCTGCTC ATTACCTCGC TTCCCGTTTT 

51 CGCCAGTGTC TTTACCGCCG CCTCCATCGT CTGGCAGCTA GGCGAACCCA 

101 AGCTCGCCAT GCCCTTCGTA CTCGGCATCA TCGCCGGCGG CCTGGTCGAT 

151 TTGGACAACC GCCTGACCGG ACGGCTGAAA AACATCATCG CCACCGTCGC 

201 CCTGTTTACC CTCTCCTCGC TCACGGCGCA AAGCACCCTC GGCACAGGGC 

251 TGCCCTTCAT CCTCGCCATG ACCCTGATGA CCTTCGGCTT TACCATTTTA 

301 GGCGCGGTCG GGCTGAAATA CCGCACCTTC GCCTTCGGCG CACTCGCCGT 

351 CGCCACCTAC ACCACGCTTA CCTACACCCC CGAAACCTAC TGGCTGACCA 

401 ACCCCTTCAT GATTTTATGC GGCACCGTAC TGTACAGCAC CGCCATCATC 

451 CTGTTCCAAA TCATCCTGCC CCACCGCCCC GTCCAAGAAA GCGTCGCCAA 

501 TGCCTACGAA GCACTCGGCG GCTACCTCGA AGCCAAAGCC GACTTCTTCG 
551 ACCCCGATGA GGCAGCCTGG ATAGGCAACC GCCACATCGA CCTCGCCATG

601 AGCAACACCG GCGTCATCAC CGCCTTCAAC CAATGCCGTT CCGCCCTGTT 

651 TTACCGTTTG CGCGGCAAAC ACCGCCACCC GCGCACCGCC AAAATGCTGC 

701 GCTACTACTT CGCCGCCCAA GACATCCACG AACGCATCAG CTCCGCCCAC 

751 GTCGACTACC AAGAGATGTC CGAAAAATTC AAAAACACCG ACATCATCTT 

801 CCGCATCCGC CGCCTGCTCG AAATGCAGGG GCAGGCGTGC CGCAACACCG 

851 CCCAAGCCAT CCGGTCGGGC AAAGACTAcg tTTACAGCAA ACGCCTCGGA 

901 CGCGCCATcg aaggctgCCG CCAGTCGCtg cgcctCCTTt cagacggcaA 

951 CGACAGTCCC GACATCCGCC ACCTGAGccg CCTTCTCGAC AACCTCGgca 

1001 GCGTcgacca gcagtTCcgc caactCCGAC ACAgcgactC CCCCGCcgaa 

1051 Aacgaccgca tgggcgacaC CCGCATCGCC GCCCtcgaaa ccggcagctT 

1101 caaaaaCAcc tggcaggCAA TCCGTCCGCa gctgaaCCTC GAATCatgCG 

1151 TATTCCGCCA TGCCGTCCGC CTGTCCCTCG TCGTTGCCGC CGCCTGCACC 

1201 ATCGTCgaag cCCTCAACCT CAACCTCGGC TACTGGATAC TGCTGACCGC 

1251 CCTTTTCGTC TGCCAACCCA ACTACACCGC CACCAAAAGC CGCGTGTACC 

1301 AACGCATCGC CGGCACCGTA CTCGGCGTAA TCGTCGGCTC GCTCGTCCCC 

1351 TACTTCACCC CCTCCGTCGA AACCAAACTC TGGATTGTCA TCGCCGGTAC

1401 CACCCTGTTC TTCATGACCC GCACCTACAA ATACAGTTTC TCCACCTTCT
1451 TCATCACCAT TCAGGCACTG ACCAGCCTCT CCCTCGCAGG TTTGGACGTA
1501 TACGCCGCCA TGCCCGTGCG CATCATcgaC ACCATTATCG GCGCATCCCT 
1551 TGCCTGGGCG GCGGTCAGCT ACCTGTGGCC AGACTGGAAA TACCTCACGC 
1601 TCGAACGCAC CGCCGCCCTT GCCGTATGCA GCAGCGGCAC ATACCTCCAA 
1651 AAAATTGCCG AACGCCTCAA AACCGGCGAA ACCGGCGACG ACATAGAATA 
1701 CCGCATCACC CGCCGCCGCG CCCACGAACA CACCGCCGCC CTCAGCAGCA 
1751 CCCTTTCCGA CATGAGCAGC GAACCCGCAA AATTCGCCGA CAGCCTGCAA 
1801 CCCGGCTTTA CCCTGCTCAA AACCGGCTAC GCCCTGACCG GCTACATCTC 
1851 CGCCCTCGGC GCATACCGCA GCGAAATGCA CGAAGAATGC AGCCCCGACT 
1901 TTACCGCACA GTTCCACCTT GCCGCCGAAC ACACCGCCCA CATCTTCCAA 
1951 CACCTGCCCG ACATGGGACC CGACGACTTT CAGACGGCAT TGGATACACT 
2001 GCGCGGCGAA CTCGGCACCC TCCGCACCCG CAGCAGCGGA ACACAAAGCC 
2051 ACATCCTCCT CCAACAGCTC CAACTCATCG CccgGCAACT CGAACCCTAC 
2101 TACCGCGCCT ACCGACAAAT TCCGCACAGG CAGCCCCAAA ACGCAGCCTG 

2151 A

This corresponds to the amino acid sequence [<SEQ ID 112; ORF19ng-1>] (SEQ ID NO: 112; 
ORF19ng-1): 



1 MKTPLLKPLL ITSLPVFASV FTAASIVWQL GEPKLAMPFV LGIIAGGLVD 

51 LDNRLTGRLK NIIATVALFT LSSLTAQSTL GTGLPFILAM TLMTFGFTIL 

101 GAVGLKYRTF AFGALAVATY TTLTYTPETY WLTNPFMILC GTVLYSTAII 

151 LFQIILPHRP VQESVANAYE ALGGYLEAKA DFFDPDEAAW IGNRHIDLAM 

201 SNTGVITAFN QCRSALFYRL RGKHRHPRTA KMLRYYFAAQ DIHERISSAH 

251 VDYQEMSEKF KNTDIIFRIR RLLEMQGQAC RNTAQAIRSG KDYVYSKRLG 

301 RAIEGCRQSL RLLSDGNDSP DIRHLSRLLD NLGSVDQQFR QLRHSDSPAE 

351 NDRMGDTRIA ALETGSFKNT WQAIRPQLNL ESCVFRHAVR LSLVVAAACT 

401 IVEALNLNLG YWILLTALFV CQPNYTATKS RVYQRIAGTV LGVIVGSLVP 

451 YFTPSVETKL WIVIAGTTLF FMTRTYKYSF STFFITIQAL TSLSLAGLDV 

501 YAAMPVRIID TIIGASLAWA AVSYLWPDWK YLTLERTAAL AVCSSGTYLQ 

551 KIAERLKTGE TGDDIEYRIT RRRAHEHTAA LSSTLSDMSS EPAKFADSLQ 

601 PGFTLLKTGY ALTGYISALG AYRSEMHEEC SPDFTAQFHL AAEHTAHIFQ 

651 HLPDMGPDDF QTALDTLRGE LGTLRTRSSG TQSHILLQQL QLIARQLEPY 

701 YRAYRQIPHR QPQNAA* 



ORF19ng-1 (SEQ ID NO: 112) and ORF19-1 (SEQ ID NO: 106) show 95.5% identity in 716 aa 
overlap: 
10 20 30 40 50 60 

orf19-1.pep MKTPLLKPLLITSLPVFASVFTAASIVWQLGEPKLAMPFVLGIIAGGLVDLDNRLTGRLK 

||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
orf19ng-1 MKTPLLKPLLITSLPVFASVFTAASIVWQLGEPKLAMPFVLGIIAGGLVDLDNRLTGRLK 

10 20 30 40 50 60 



orf19-1.pep 
orf19ng-1 



70 80 90 100 110 120 

NIITTVALFTLSSLTAQSTLGTGLPFILAMTLMTFGFTILGAVGLKYRTFAFGALAVATY 

llhllllll IIIIIIIMIII IIIIIIMIill IIIIMMIIIhllllllllll 

NIIATVALFTLSSLTAQSTLGTGLPFILAMTLMTFGFTILGAVGLKYRTFAFGALAVATY 
70 80 90 100 110 120 



orf19-1.pep 
orf19ng-1 



130 140 150 160 170 180 

TTLTYTPETYWLTNPFMILCGTVLYSTAILLFQIVLPHRPVQESVANAYDALGGYLEAKA 

1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 M I M I hi I MUI I M 1 1 1 1 1 M I hi 1 1 1 1 M M 

TTLTYTPETYWLTNPFMILCGTVLYSTAIILFQIILPHRPVQESVANAYEALGGYLEAKA 
130 140 150 160 170 180 



orf19-1.pep 
orf19ng-1 



190 200 210 220 230 240 

DFFDPDEAAWIGNRHIDLAMSNTGVITAFNQCRSALFYRLRGKHRHPRTAKMLRYYFAAQ 

I i [ [ill 1 i i i r ! 1 1 1 1 1 1 1 1 1 ! i 1 1 1 1 1 1 1 1 1 

DFFDPDEAAWIGNRHIDLAMSNTGVITAFNQCRSALFYRLRGKHRHPRTAKMLRYYFAAQ 
190 200 210 220 230 240 



orf19-1.pep 
orf19ng-1 



250 260 270 280 290 300 

DIHERISSAHVDYQEMSEKFKNTDIIFRIHRLLEMQGQACRNTAQALRASKDYVYSKRLG 

I I I I M I M I I I I I 1 1 I I I I t I I ! I I I 1 : I M !. I I I I I I I I I I I : : : I I 1 1 M 1 I I 1 
DIHERISSAHVDYQEMSEKFKNTDIIFRIRRLLEMQGQACRNTAQAIRSGKDYVYSKRLG 
250 260 270 280 290 300 



orf19-1.pep 
orf19ng-1 



310 320 330 340 350 360 

RAIEGCRQSLRLLSDSNDSPDIRHLRRLLDNLGSVDQQFRQLQHNGLQAENDRMGDTRIA 

lllhllllllllhlllll III lllllli hllllhh IIIIIIIMIII 
RAIEGCRQSLRLLSDGNDSPDIRHLSRLLDNLGSVDQQFRQLRHSDSPAENDRMGDTRIA 

310 320 330 340 350 360 



orf19-1.pep 
orf19ng-1 



370 380 390 400 410 420 

ALETSSLKNTWQAIRPQLNLESGVFRHAVRLSLVVAAACTIVEALNLNLGYWILLTALFV 

II I M :|l MINIM illl I 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 ! 1 1 1 1 1 1 1 1 1 1 1 1 1 

ALETGSFKNTWQAIRPQLNLESCVFRHAVRLSLVVAAACTIVEALNLNLGYWILLTALFV 
370 380 390 400 410 420 



orf19-1.pep 
orf19ng-1 



430 440 450 460 470 480 

CQPNYTATKSRVRQRIAGTVLGVIVGSLVPYFTPSVETKLWIVIASTTLFFMTRTYKYSF 

1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 : 1 1 1 1 1 1 1 1 1 1 1 1 1 1 

CQPNYTATKSRVYQRIAGTVLGVIVGSLVPYFTPSVETKLWIVIAGTTLFFMTRTYKYSF 
430 440 450 460 470 480 



orf19-1.pep 
orf19ng-1 



490 500 510 520 530 540 

STFFITIQALTSLSLAGLDVYAAMPVRIIDTIIGASLAWAAVSYLWPDWKYLTLERTAAL 

||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
STFFITIQALTSLSLAGLDVYAAMPVRIIDTIIGASLAWAAVSYLWPDWKYLTLERTAAL 

490 500 510 520 530 540 



orf19-1.pep 



550 560 570 580 590 600 

AVCSNGAYLEKITERLKSGETGDDVEYRATRRRAHEHTAALSSTLSDMSSEPAKFADSLQ 
Illhhihihillhlllllhlll 1 1 1 1 1 1 M M 1 1 M 1 1 1 1 M 1 1 1 1 1 1 1 II I 

orf 19ng-l AVCSSGTYLQKIAERLKTGETGDDIEYRITRRRAHEHTAALSSTLSDMSSEPAKFADSLQ 

550 560 570 580 590 600 

610 620 630 640 650 660 

orf19-1.pep PGFTLLKTGYALTGYISALGAYRSEMHEECSPDFTAQFHLAAEHTAHIFQHLPETEPDDF 

Mill I I Ml MM MIMIMIIMIIIIIIIIMIIIIIMIIh 1 1 1 1 

orf 19ng-l PGFTLLKTGYALTGYISALGAYRSEMHEECSPDFTAQFHLAAEHTAHIFQHLPDMGPDDF 

610 620 630 640 650 660 

670 680 690 700 710 

orf19-1.pep QTALDTLRGELDTLRTHSSGTQSHILLQQLQLIARQLEPYYRAYRQIPHRQPQNAAX 

IIIIIMIIII I M M I M 1 1 1 1 Ml M 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 II 1 1 1 1 1 1 1 M I 

orf 19ng-l QTALDTLRGELGTLRTRSSGTQSHILLQQLQLIARQLEPYYRAYRQIPHRQPQNAAX 

670 680 690 700 710 
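
The percent identity quoted for an alignment such as the one above can be illustrated with a minimal Python sketch. It assumes the two sequences are already aligned to equal length and that identity is counted per aligned column; the function name `percent_identity` is hypothetical and is not taken from the software that produced the alignment. The sample strings are the first 20 columns of the final alignment block (positions 661-680).

```python
def percent_identity(seq_a: str, seq_b: str) -> float:
    """Percent of aligned columns holding the same residue in both sequences."""
    if len(seq_a) != len(seq_b):
        raise ValueError("aligned sequences must have equal length")
    if not seq_a:
        raise ValueError("aligned sequences must not be empty")
    # Count columns where both residues match and neither is a gap character.
    matches = sum(1 for a, b in zip(seq_a, seq_b) if a == b and a != "-")
    return 100.0 * matches / len(seq_a)


# First 20 aligned columns of the last block (ORF19-1 above, ORF19ng-1 below).
orf19_1 = "QTALDTLRGELDTLRTHSSG"
orf19ng_1 = "QTALDTLRGELGTLRTRSSG"
print(percent_identity(orf19_1, orf19ng_1))
```

The two strings differ at two of the 20 columns (D/G and H/R), so this fragment alone gives a lower figure than the 95.5% reported over the full 716 aa overlap.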

In addition, ORF19ng-1 (SEQ ID NO: 112) shows significant homology to a hypothetical 
gonococcal protein (SEQ ID NO: 1121) previously entered in the databases: 

sp|O33369|YOR2_NEIGO HYPOTHETICAL 45.5 KD PROTEIN (ORF2) gnl|PID|e1154438 
(AJ002423) hypothetical protein [Neisseria gonorrh] Length = 417 
Score = 1512 (705.6 bits), Expect = 5.3e-203, P = 5.3e-203 
Identities = 301/326 (92%), Positives = 306/326 (93%) 

Query: 307 RQSLRLLSDGNDSPDIRHLSRLLDNLGSVDQQFRQLRHSDSPAENDRMGDTRIAALETGS 366 

RQSLRLLSDGNDS DIRHLSRLLDNLGSVDQQFRQLRHSDSPAENDRMGDTRIAALETGS 
Sbjct: 1 RQSLRLLSDGNDSXDIRHLSRLLDNLGSVDQQFRQLRHSDSPAENDRMGDTRIAALETGS 60 

Query: 367 FKNTWQAIRPQLNLESCVFRHAVRLSLVVAAACTIVEALNLNLGYWILLTALFVCQPNYT 426 
FKNTWQAIRPQLNLES VFRHAVRLSLVVAAACTIVEALNLNLGYWILLT LFVCQPNYT 

Sbjct: 61 FKNTWQAIRPQLNLESGVFRHAVRLSLVVAAACTIVEALNLNLGYWILLTRLFVCQPNYT 120 

Query: 427 ATKSRVYQRIAGTVLGVIVGSLVPYFTPSVETKLWIVIAGTTLFFMTRTYKYSFSTFFIT 486 

ATKSRVYQRIAGTVLGVIVGSLVPYFTPSVETKLWIVIAGTTLFFMTRTYKYSFSTFFIT 
Sbjct: 121 ATKSRVYQRIAGTVLGVIVGSLVPYFTPSVETKLWIVIAGTTLFFMTRTYKYSFSTFFIT 180 

Query: 487 IQALTSLSLAGLDVYAAMPVRIIDTIIGASLAWAAVSYLWPDWKYLTLERTAALAVCSSG 546 

IQALTSLSLAGLDVYAAMPVRIIDTIIGASLAWAAVSYLWPDWKYLTLERTAALAVCSSG 
Sbjct: 181 IQALTSLSLAGLDVYAAMPVRIIDTIIGASLAWAAVSYLWPDWKYLTLERTAALAVCSSG 240 

Query: 547 TYLQKIAERLKTGETGDDIEYRITRRRAHEHTAALSSTLSDMSSEPAKFADSLQPGFTLL 606 
TYLQKIAERLKTGETGDDIEYRITRRRAHEHTAALSSTLSDMSSEPAKFAD+ P 
Sbjct: 241 TYLQKIAERLKTGETGDDIEYRITRRRAHEHTAALSSTLSDMSSEPAKFADTCNPALPCS 300 

Query: 607 KTGYALTGYISALGAYRSEMHEECSP 632 

K ALTGYISALG ++ + +P 
Sbjct: 301 KPATALTGYISALGHTAAKCTKNAAP 326 

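The whole-number percentages in the search output above appear to be truncated rather than rounded: 306/326 is about 93.9% but is reported as 93%. The following Python sketch reproduces the reported figures under that assumption; the helper name `blast_percent` is hypothetical and is not part of any search program.

```python
def blast_percent(count: int, length: int) -> int:
    """Whole-number percentage of count over length, truncated (not rounded)."""
    if length <= 0:
        raise ValueError("alignment length must be positive")
    # Integer floor division drops the fractional part of the percentage.
    return (100 * count) // length


identities = blast_percent(301, 326)  # "Identities = 301/326"
positives = blast_percent(306, 326)   # "Positives = 306/326"
print(identities, positives)
```

Rounding to the nearest integer would instead give 94% for the positives, which does not match the reported value.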
Based on this analysis, including the presence of several putative transmembrane domains in the 
gonococcal protein (the first of which is also seen in the meningococcal protein), and on homology 
with the YHFK protein (SEQ ID NO: 1120), it is predicted that the proteins from N. meningitidis 
and N. gonorrhoeae, and their epitopes, could be useful antigens for vaccines or diagnostics, or for 
raising antibodies. 

Example 14 

The following DNA sequence, believed to be complete, was identified in N. meningitidis [<SEQ ID 
113>] (SEQ ID NO: 113): 

1 ATGAATATGC TGGGAGCTTT GGCAAAAGTC GGCAGCCTGA CGATGGTGTC 

51 GCGCGTTTTG GGATTTGTGC GCGATACGGT CATTGCGCGG GCATTCGGCG 

101 CGGGTATGGC GACGGATGCG TTTTTTGTCG CGTTCAAACT GCCCAACCTG 

151 CTTCGCCGCG TGTTTGCGGA GGGGGCGTTT GCCCAAGCGT TTGTGCCGAT 

201 TTTGGCGGAA TACAAGGAAA CGCGTTCAAA AGAGGCGG.C GAAGCCTTTA 

251 TCCGCCATGT GGCGGGGATG CTGTCGTTTG TACTGGTTAT CGTTACCGCG 

301 CTGGGCATAC TTGCCGCGCC TTGGGTGATT TATGTTTCCG CACCCGAGTT 

351 TTGCCCAAGA TGCCGACAAA TTTCAGCTCT CCATCGATTT GCTGCGGATT 

401 ACGTTTCCTT ATATATTATT GATTTCCCTG TCTTCATTTG TCGGCTCGGT 

451 ACTCAATTCT TATCATAAGT TCGGCATTCC GGCGTTTACG CCAC.GTTTC 

501 TGAACGTGTC GTTTATCGTA TTCGCGCTGT TTTTCGTGCC GTATTTCGAT 

551 CCGCCCGTTA CCGCGCyGGC GTGGGCGGTC TTTGTCGGCG GCATTTTGCA 

601 ACTCGrmTTC CAACTGCCCT GGCTGGCGAA ACTGGGCTTT TTGAAACTGC 

651 CCAAACtGAG TTTCAAAGAT GCGGCGGTCA ACCGCGTGAT GAAACAGATG 

701 GCGCCTGCgA TTTTgGGCGT GAgCGTGGCG CAGGTTTCTT TGGTGATCAA 

751 CACGATTTTc GCGTCTTATC TGCAATCGGG CAGCGTTTCA TGGATGTATT 

801 ACGCCGACCG CATGATGGAG CTGCCCAGCG GCGTGCTGGG GGCGGCACTC 

851 GGTACGATTT TGCTGCCGAC TTTGTCCAAA CACTCGGCAA ACCaAGATAC 

901 GGaACAGTTT TCCGCCCTGC TCGACTGGGG TTTGCGCCTG TGCATGCtgc 

951 TGACGCTGCC GGCGgcGGTC GGACTGGCGG TGTTGTCGTT cCCgCtGGTG 

1001 GCGACGCTGT TTATGTACCG CGwATTTACG CTGTTTGACG CGCAGATGAC 

1051 GCAACACGCG CTGATTGCCT ATTCTTTCGG TTTAATCGGC TTAATCATGA 

1101 TTAAAGTGTT GGCACCCGGC TTCTATGCGC GGCAAAACAT CAAwAmGCCC 

1151 GTCAAAATCG CCATCTTCAC GCTCATCTGC mCGCAGTTGA TGAACCTTGs 

1201 CTTTAyCGGC CCACTrrAAC rCasTCGGAC TTTCGCTTGC CATCGGTCTG 

1251 GGCGCGTGTA TCAATGCCGG ATTGTTGTTT TACCTGTTGC GCAGACACGG 

1301 TATTTACCAA CCTGG.CAAG GGTTGGGCAG CGTTCTT.AG CAAAAATGCT 

1351 GcTCTCGCTC GCCGTGA 

This corresponds to the amino acid sequence [<SEQ ID 114; ORF20>] (SEQ ID NO: 114; 
ORF20): 



1 MNMLGALAKV GSLTMVSRVL GFVRDTVIAR AFGAGMATDA FFVAFKLPNL 

51 LRRVFAEGAF AQAFVPILAE YKETRSKEAX EAFIRHVAGM LSFVLVIVTA 

101 LGILAAPWVI YVSAPSFAQD ADKFQLSIDL LRITFPYILL ISLSSFVGSV 

151 LNSYHKFGIP AFTPXFLNVS FIVFALFFVP YFDPPVTAXA WAVFVGGILQ 

201 LXFQLPWLAK LGFLKLPKLS FKDAAVNRVM KQMAPAILGV SVAQVSLVIN 

251 TIFASYLQSG SVSWMYYADR MMELPSGVLG AALGTILLPT LSKHSANQDT 

301 EQFSALLDWG LRLCMLLTLP AAVGLAVLSF PLVATLFMYR XFTLFDAQMT 

351 QHALIAYSFG LIGLIMIKVL APGFYARQNI XXPVKIAIFT LICXQLMNLX 

401 FXGPLXXIGL SLAIGLGACI NAGLLFYLLR RHGIYQPXQG LGSVLXQKCC 

451 SRSP* 

These sequences were elaborated, and the complete DNA sequence [<SEQ ID 115>] (SEQ ID NO: 
115) is: 



1 ATGAATATGC TGGGAGCTTT GGCAAAAGTC GGCAGCCTGA CGATGGTGTC 
51 GCGCGTTTTG GGATTTGTGC GCGATACGGT CATTGCGCGG GCATTCGGCG 
101 CGGGTATGGC GACGGATGCG TTTTTTGTCG CGTTCAAACT GCCCAACCTG 

151 CTTCGCCGCG TGTTTGCGGA GGGGGCGTTT GCCCAAGCGT TTGTGCCGAT 

201 TTTGGCGGAA TACAAGGAAA CGCGTTCAAA AGAGGCGGCG GAGGCTTTTA 

251 TCCGCCATGT GGCGGGGATG CTGTCGTTTG TACTGGTTAT CGTTACCGCG 

301 CTGGGCATAC TTGCCGCGCC TTGGGTGATT TATGTTTCCG CACCCGGTTT 

351 TGCCCAAGAT GCCGACAAAT TTCAGCTCTC CATCGATTTG CTGCGGATTA 

401 CGTTTCCTTA TATATTATTG ATTTCCCTGT CTTCATTTGT CGGCTCGGTA 

451 CTCAATTCTT ATCATAAGTT CGGCATTCCG GCGTTTACGC CCACGTTTCT 

501 GAACGTGTCG TTTATCGTAT TCGCGCTGTT TTTCGTGCCG TATTTCGATC 

551 CGCCCGTTAC CGCGCTGGCG TGGGCGGTCT TTGTCGGCGG CATTTTGCAA 

601 CTCGGCTTCC AACTGCCCTG GCTGGCGAAA CTGGGCTTTT TGAAACTGCC 

651 CAAACTGAGT TTCAAAGATG CGGCGGTCAA CCGCGTGATG AAACAGATGG 

701 CGCCTGCGAT TTTGGGCGTG AGCGTGGCGC AGGTTTCTTT GGTGATCAAC 

751 ACGATTTTCG CGTCTTATCT GCAATCGGGC AGCGTTTCAT GGATGTATTA 

801 CGCCGACCGC ATGATGGAGC TGCCCAGCGG CGTGCTGGGG GCGGCACTCG 

851 GTACGATTTT GCTGCCGACT TTGTCCAAAC ACTCGGCAAA CCAAGATACG 

901 GAACAGTTTT CCGCCCTGCT CGACTGGGGT TTGCGCCTGT GCATGCTGCT 

951 GACGCTGCCG GCGGCGGTCG GACTGGCGGT GTTGTCGTTC CCGCTGGTGG 

1001 CGACGCTGTT TATGTACCGC GAATTTACGC TGTTTGACGC GCAGATGACG 

1051 CAACACGCGC TGATTGCCTA TTCTTTCGGT TTAATCGGCT TAATCATGAT 

1101 TAAAGTGTTG GCACCCGGCT TCTATGCGCG GCAAAACATC AAAACGCCCG 

1151 TCAAAATCGC CATCTTCACG CTCATCTGCA CGCAGTTGAT GAACCTTGCC 

1201 TTTATCGGCC CACTGAAACA CGTCGGACTT TCGCTTGCCA TCGGTCTGGG 

1251 CGCGTGTATC AATGCCGGAT TGTTGTTTTA CCTGTTGCGC AGACACGGTA 

1301 TTTACCAACC TGGCAAGGGT TGGGCAGCGT TCTTAGCAAA AATGCTGCTC 

1351 TCGCTCGCCG TGATGTGCGG CGGACTGTGG GCAGCGCAGG CTTACCTGCC 

1401 GTTTGAATGG GCGCACGCCG GCGGAATGCG GAAAGCGGGG CAGCTCTGCA 

1451 TCCTGATTGC CGTCGGCGGC GGACTGTATT TCGCATCACT GGCGGCTTTG 

1501 GGCTTCCGTC CGCGCCATTT CAAACGCGTG GAAAACTGA 

This corresponds to the amino acid sequence [<SEQ ID 116; ORF20-1>] (SEQ ID NO: 116; 
ORF20-1): 

1 MNMLGALAKV GSLTMVSRVL GFVRDTVIAR AFGAGMATDA FFVAFKLPNL 

51 LRRVFAEGAF AQAFVPILAE YKETRSKEAA EAFIRHVAGM LSFVLVIVTA 

101 LGILAAPWVI YVSAPGFAQD ADKFQLSIDL LRITFPYILL ISLSSFVGSV 

151 LNSYHKFGIP AFTPTFLNVS FIVFALFFVP YFDPPVTALA WAVFVGGILQ 

201 LGFQLPWLAK LGFLKLPKLS FKDAAVNRVM KQMAPAILGV SVAQVSLVIN 

251 TIFASYLQSG SVSWMYYADR MMELPSGVLG AALGTILLPT LSKHSANQDT 

301 EQFSALLDWG LRLCMLLTLP AAVGLAVLSF PLVATLFMYR EFTLFDAQMT 

351 QHALIAYSFG LIGLIMIKVL APGFYARQNI KTPVKIAIFT LICTQLMNLA 

401 FIGPLKHVGL SLAIGLGACI NAGLLFYLLR RHGIYQPGKG WAAFLAKMLL 
451 SLAVMCGGLW AAQAYLPFEW AHAGGMRKAG QLCILIAVGG GLYFASLAAL 
501 GFRPRHFKRV EN* 

Computer analysis of this amino acid sequence gave the following results: 

Homology with the MviN virulence factor of S. typhimurium (accession number P37169) (SEQ ID 
NO: 1122) 



ORF20 (SEQ ID NO: 114) and MviN proteins (SEQ ID NO: 1122) show 63% aa identity in 440aa 
overlap: 



Orf20 1 MNMLGALAKVGSLTMVSRVLGFVRDTVIARAFGAGMATDAFFVAFKLPNLLRRVFAEGAF 60 

MN+L +LA V S+TM SRVLGF RD ++AR FGAGMATDAFFVAFKLPNLLRR+FAEGAF 
MviN 14 MNLLKSLAAVSSMTMFSRVLGFARDAIVARIFGAGMATDAFFVAFKLPNLLRRIFAEGAF 73 



Orf20 61 AQAFVPILAEYKETRSKEAXEAFIRHVAGMLSFVLVIVTALGILAAPWVIYVSAPSFAQD 120 

+QAFVPILAEYK + +EA F+ +V+G+L+ L +VT G+LAAPWVI V+AP FA 
MviN 74 SQAFVPILAEYKSKQGEEATRIFVAYVSGLLTLALAVVTVAGMLAAPWVIMVTAPGFADT 133 

Orf20 121 ADKFQLSIDLLRITFPYILLISLSSFVGSVLNSYHKFGIPAFTPXFLNVSFIVFALFFVP 180 

ADKF L+ LLRITFPYILLISL+S VG++LN++++F IPAF P FLN+S I FALF P 
MviN 134 ADKFALTTQLLRITFPYILLISLASLVGAILNTWNRFSIPAFAPTFLNISMIGFALFAAP 193 

Orf20 181 YFDPPVTAXAWAVFVGGILQLXFQLPWLAKLGFLKLPKLSFKDAAVNRVMKQMAPAILGV 240 

YF+PPV A AWAV VGG+LQL +QLP+L K+G L LP+++F+D RV+KQM PAILGV 
MviN 194 YFNPPVLALAWAVTVGGVLQLVYQLPYLKKIGMLVLPRINFRDTGAMRVVKQMGPAILGV 253 

Orf20 241 SVAQVSLVINTIFASYLQSGSVSWMYYADRMMELPSGVLGAALGTILLPTLSKHSANQDT 300 

SV+Q+SL+INTIFAS+L SGSVSWMYYADR+ME PSGVLG ALGTILLP+LSK A+ + 
MviN 254 SVSQISLIINTIFASFLASGSVSWMYYADRLMEFPSGVLGVALGTILLPSLSKSFASGNH 313 

Orf20 301 EQFSALLDWGLRLCMLLTLPAAVGLAVLSFPLVATLFMYRXFTLFDAQMTQHALIAYSFG 360 

+ + + L+DWGLRLC LL LP+AV L +L+ PL +LF Y FT FDA MTQ ALIAYS G 
MviN 314 DEYCRLMDWGLRLCFLLALPSAVALGILAKPLTVSLFQYGKFTAFDAAMTQRALIAYSVG 373 

Orf20 361 LIGLIMIKVLAPGFYARQNIXXPVKIAIFTLICXQLMNLXFXXXXXXXXXXXXXXXXXCI 420 

LIGLI ++KVLAPGFY+RQ+ I PVKIAI TLI QLMNL F C+ 
MviN 374 LIGLIVVKVLAPGFYSRQDIKTPVKIAIVTLIMTQLMNLAFIGPLKHAGLSLSIGLAACL 433 



Orf20 421 NAGLLFYLLRRHGIYQPXQG 440 

NA LL++ LR+ I+ P G 
MviN 434 NASLLYWQLRKQNIFTPQPG 453 



Homology with a predicted ORF from N. meningitidis (strain A) 

ORF20 (SEQ ID NO: 114) shows 93.5% identity over a 447aa overlap with an ORF (ORF20a) 
(SEQ ID NO: 118) from strain A of N. meningitidis: 






10 20 30 40 50 60 

orf 20 . pep MNMLGALAKVGSLTMVSRVLGFVRDTVIARAFGAGMATDAFFVAFKLPNLLRRVFAEGAF 

I I I I I I I :| I I I I I I M II I I I I I I . I I I I I M I I II I 11 I I I I I M I II I I I I II I I I 
orf20a MNMLGALVKVGSLTMVSRVLGFVRDTVIARAFGAGMATDAFFVAFKLPNLLRRVFAEGAF 

10 20 30 40 50 60 






70 80 90 100 110 120 

orf20.pep AQAFVPILAEYKETRSKEAXEAFIRHVAGMLSFVLVIVTALGILAAPWVIYVSAPSFAQD 

I I I I I I I I I I II II I M hi I II I I II I I I I i I I I I I I I I I 1 1 1 1 Ml I I I h I h 
orf20a AQAFVPILAEYKETRSKEATEAFIRHVAGMLSFVLVIVTALGILAAPWVIYVSAPGFAKD 

70 80 90 100 110 120 






130 140 150 160 170 180 

orf20.pep ADKFQLSIDLLRITFPYILLISLSSFVGSVLNSYHKFGIPAFTPXFLNVSFIVFALFFVP 

IIIMI llllllll IIIMIinillMII'lhlllll :|IIIMIMIIIII 

orf20a ADKFQLSIDLLRITFPYILLISLSSFVGSVLNSYHKFSIPAFTPTFLNVSFIVFALFFVP 

130 140 150 160 170 180 



190 200 210 220 230 240 

orf20.pep YFDPPVTAXAWAVFVGGILQLXFQLPWLAKLGFLKLPKLSFKDAAVNRVMKQMAPAILGV 

IIIIIMI III IIIIIIM I M 1 1 1 1 II 1 1 1 1 1 M 1 1 1 1 1 1 1 1 1 II 1 1 1 1 1 II 1 1 

orf20a YFDPPVTALAWAVFVGGILQLGFQLPWLAKLGFLKLPKLSFKDAAVNRVMKQMAPAILGV 

190 200 210 220 230 240 

250 260 270 280 290 300 

orf20.pep SVAQVSLVINTIFASYLQSGSVSWMYYADRMMELPSGVLGAALGTILLPTLSKHSANQDT 

I I I : I I I I I I I I I I I I I MM I I I I I I I I I II I U I I I I I I I I I I I I I I I I I I I I I i 
orf20a SVAQISLVINTIFASYLQSGSVSWMYYADRMMELPGGVLGAALGTILLPTLSKHSANQDT 

250 260 270 280 290 300 

310 320 330 340 350 360 

orf20.pep EQFSALLDWGLRLCMLLTLPAAVGLAVLSFPLVATLFMYRXFTLFDAQMTQHALIAYSFG 

IIIIIIIIIMI 1 1 1 II 1 1 1 1 Ml 1 1 1 M 1 1 1 1 1 1 1 1 MMMMMIIMM I 

orf20a EQFSALLDWGLRXCMLLTLPAAVGMAVLSFPLVATLFMYREFTLFDAQMTQHALIAYSFG 

310 320 330 340 350 360 

370 380 390 400 410 420 

orf20.pep LIGLIMIKVLAPGFYARQNIXXPVKIAIFTLICXQLMNLXFXGPLXXIGLSLAIGLGACI 

MM MINI MM II II II MMMMMIIMM I III Mill IIIIIIM 

orf20a LIGLIMIKVLAPGFYARQNIKTPVKIAIFTLICTQLMNLAFIGPLKHVGLSLAIGLGACI 

370 380 390 400 410 420 

430 440 450 

orf20.pep NAGLLFYLLRRHGIYQPXQGLGSVLXQKCCSRSPX 

I I I I I I I I M I I I I I : | : : | : 
orf20a NAGLLFYLLRRHGIYQPGKGWAAFLAKMLLSLAVMGGGLYAAQIWLPFDWAHAGGMQKAA 

430 440 450 460 470 480 

The complete length ORF20a nucleotide sequence [<SEQ ID 117>] (SEQ ID NO: 117) is: 

1 ATGAATATGC TGGGAGCTTT GGTAAAAGTC GGCAGCCTGA CGATGGTGTC 

51 GCGCGTTTTG GGATTTGTGC GCGATACGGT CATTGCGCGC GCATTCGGCG 

101 CAGGCATGGC GACGGATGCG TTCTTTGTCG CGTTCAAACT GCCCAACCTG 

151 CTTCGCCGCG TGTTTGCGGA GGGGGCGTTT GCCCAAGCGT TTGTGCCGAT 

201 TTTGGCGGAA TATAAGGAAA CGCGTTCTAA AGAGGCGACG GAGGCTTTTA 

251 TCCGCCATGT GGCGGGGATG CTGTCGTTTG TACTGGTCAT CGTTACCGCG 

301 CTGGGCATAC TTGCCGCGCC TTGGGTGATT TATGTTTCCG CACCCGGTTT 

351 TGCCAAAGAT GCCGACAAAT TTCAGCTCTC TATCGATTTG CTGCGGATTA 

401 CGTTTCCTTA TATCTTATTG ATTTCACTTT CCTCTTTTGT CGGCTCGGTA 
451 CTCAATTCCT ATCATAAATT CAGCATTCCT GCGTTTACGC CCACGTTCCT 
501 GAACGTGTCG TTTATCGTAT TCGCGCTGTT TTTCGTGCCG TATTTCGATC 
551 CTCCCGTTAC CGCGCTGGCT TGGGCGGTTT TTGTCGGCGG CATTTTGCAA 
601 CTCGGCTTCC AACTGCCCTG GCTGGCGAAA CTGGGTTTTT TGAAACTGCC 
651 CAAACTGAGT TTCAAAGATG CGGCGGTCAA CCGCGTGATG AAACAGATGG 
701 CGCCTGCGAT TTTGGGCGTG AGCGTGGCGC AGATTTCTTT GGTGATCAAC 
751 ACGATTTTCG CGTCTTATCT GCAATCGGGC AGCGTTTCAT GGATGTATTA 
801 CGCCGACCGC ATGATGGAAC TGCCCGGCGG CGTGCTGGGG GCGGCACTCG 
851 GTACGATTTT GCTGCCGACT TTGTCCAAAC ACTCGGCAAA CCAAGATACG 
901 GAACAGTTTT CCGCCCTGCT CGACTGGGGT TTGCGCNTGT GCATGCTGCT 
951 GACGCTGCCG GCGGCGGTCG GAATGGCGGT GTTGTCGTTC CCGCTGGTGG 

1001 CAACCTTGTT TATGTACCGA GAATTCACGC TGTTTGACGC GCAGATGACG 

1051 CAACACGCGC TGATTGCCTA TTCTTTCGGT TTAATCGGTT TAATCATGAT 

1101 TAAAGTGTTG GCGCCCGGCT TTTATGCGCG GCAAAACATC AAAACGCCCG 

1151 TCAAAATCGC CATCTTCACG CTCATTTGCA CGCAGTTGAT GAACCTTGCC 

1201 TTTATCGGCC CACTGAAACA CGTCGGACTT TCGCTTGCCA TCGGTCTGGG 

1251 CGCGTGTATC AATGCCGGAT TGTTGTTTTA CCTGTTGCGC AGACACGGTA 

1301 TTTACCAACC TGGCAAGGGT TGGGCAGCGT TCTTGGCAAA AATGCTGCTC 

1351 TCGCTCGCCG TGATGGGAGG CGGCCTGTAT GCCGCCCAAA TCTGGCTGCC 

1401 GTTCGACTGG GCACACGCCG GCGGAATGCA AAAGGCCGCC CGGCTCTTCA 
1451 TCCTGATTGC CGTCGGCGGC GGACTGTATT TCGCATCACT GGCGGCTTTG 
1501 GGCTTCCGTC CGCGCCATTT CAAACGCGTG GAAAGCTGA 

This encodes a protein having amino acid sequence [<SEQ ID 118>] (SEQ ID NO: 118): 



1 MNMLGALVKV GSLTMVSRVL GFVRDTVIAR AFGAGMATDA FFVAFKLPNL 

51 LRRVFAEGAF AQAFVPILAE YKETRSKEAT EAFIRHVAGM LSFVLVIVTA 

101 LGILAAPWVI YVSAPGFAKD ADKFQLSIDL LRITFPYILL ISLSSFVGSV 

151 LNSYHKFSIP AFTPTFLNVS FIVFALFFVP YFDPPVTALA WAVFVGGILQ 

201 LGFQLPWLAK LGFLKLPKLS FKDAAVNRVM KQMAPAILGV SVAQISLVIN 

251 TIFASYLQSG SVSWMYYADR MMELPGGVLG AALGTILLPT LSKHSANQDT 

301 EQFSALLDWG LRXCMLLTLP AAVGMAVLSF PLVATLFMYR EFTLFDAQMT 

351 QHALIAYSFG LIGLIMIKVL APGFYARQNI KTPVKIAIFT LICTQLMNLA 

401 FIGPLKHVGL SLAIGLGACI NAGLLFYLLR RHGIYQPGKG WAAFLAKMLL 

451 SLAVMGGGLY AAQIWLPFDW AHAGGMQKAA RLFILIAVGG GLYFASLAAL 

501 GFRPRHFKRV ES* 



ORF20a (SEQ ID NO: 118) and ORF20-1 (SEQ ID NO: 116) show 96.5% identity in 512 aa 
overlap: 






10 20 30 40 50 60 

orf20a.pep MNMLGALVKVGSLTMVSRVLGFVRDTVIARAFGAGMATDAFFVAFKLPNLLRRVFAEGAF 

MIIMhllllllllllllllllllMIIIIIIIIIIIIIIMMIIIIIIIIIIMM 

orf20-1 MNMLGALAKVGSLTMVSRVLGFVRDTVIARAFGAGMATDAFFVAFKLPNLLRRVFAEGAF 

10 20 30 40 50 60 






70 80 90 100 110 120 

orf20a.pep AQAFVPILAEYKETRSKEATEAFIRHVAGMLSFVLVIVTALGILAAPWVIYVSAPGFAKD 

I lililllllllllllMIIII I IIIMIIIIIIIIIIIIIIIIMIIIIIM 

orf 20-1 AQAFVPILAEYKETRSKEAAEAFIRHVAGMLSFVLVIVTALGILAAPWVIYVSAPGFAQD 

70 80 90 100 110 120 






130 140 150 160 170 180 

orf20a.pep ADKFQLSIDLLRITFPYILLISLSSFVGSVLNSYHKFSIPAFTPTFLNVSFIVFALFFVP 

1 1 , M 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 li: 1 1 1 1 1 1 1 1 1 1 M 1 1 1 1 M 1 1 1 : 1 I I M I I II 

orf20-1 ADKFQLSIDLLRITFPYILLISLSSFVGSVLNSYHKFGIPAFTPTFLNVSFIVFALFFVP 

130 140 150 160 170 180 






190 200 210 220 230 240 

orf20a.pep YFDPPVTALAWAVFVGGILQLGFQLPWLAKLGFLKLPKLSFKDAAVNRVMKQMAPAILGV 

I M I I I I I I II I I I I I I I I I I I I I I II I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 
orf20-1 YFDPPVTALAWAVFVGGILQLGFQLPWLAKLGFLKLPKLSFKDAAVNRVMKQMAPAILGV 

190 200 210 220 230 240 






250 260 270 280 290 300 

orf 20a . pep SVAQISLVINTIFASYLQSGSVSWMYYADRMMELPGGVLGAALGTILLPTLSKHSANQDT 

I I I : I i I I I M I I I I I I I I I I I M I I I I II I I I :i I I I I II I I I I I I I I I I M I I I 
orf20-l SVAQVSLVINTIFASYLQSGSVSWMYYADRMMELPSGVLGAALGTILLPTLSKHSANQDT 

250 260 270 280 290 300 






310 320 330 340 350 360 

orf 20a . pep EQFSALLDWGLRXCMLLTLPAAVGMAVLSFPLVATLFMYREFTLFDAQMTQHALIAYSFG 

II IMMIII IIIIIIIMI UIIIIIIMIMIIMIIMI I llllllllll 

orf 20 - 1 EQFSALLDWGLRLCMLLTLPAAVGLAVLSFPLVATLFMYREFTLFDAQMTQHALIAYSFG 

310 320 330 340 350 360 
370 380 390 400 410 420 

orf20a.pep LIGLIMIKVLAPGFYARQNIKTPVKIAIFTLICTQLMNLAFIGPLKHVGLSLAIGLGACI 

llllll IIIIIIIMIMIIIIMMII IIIIIIIIIMIIIIIIIIMI Mill 

orf 20-1 LIGLIMIKVLAPGFYARQNIKTPVKIAIFTLICTQLMNLAFIGPLKHVGLSLAIGLGACI 
370 380 390 400 410 420 

430 440 450 460 470 480 

orf20a.pep NAGLLFYLLRRHGIYQPGKGWAAFLAKMLLSLAVMGGGLYAAQIWLPFDWAHAGGMQKAA 

I I I I I I ! I I I I I I II I I I I I M I M ! I I h I I M I M I : I I I hlhhhhhlh 
orf 20-1 NAGLLFYLLRRHGIYQPGKGWAAFLAKMLLSLAVMCGGLWAAQAYLPFEWAHAGGMRKAG 
430 440 450 460 470 480 

490 500 510 

orf 20a .pep RLFILIAVGGGLYFASLAALGFRPRHFKRVESX 

I h 1 1 1 1 1 1 1 1 h I II II 1 1 1 1 II h h Ih 

orf20-1 QLCILIAVGGGLYFASLAALGFRPRHFKRVENX 

490 500 510 

Homology with a predicted ORF from N. gonorrhoeae 

ORF20 (SEQ ID NO: 114) shows 92.1% identity over a 454aa overlap with a predicted ORF 
(ORF20ng) (SEQ ID NO: 120) from N. gonorrhoeae: 
orf 20 .pep 
orf 20ng 
orf 20 .pep 
orf 20ng 
orf 20 .pep 
orf 20ng 
orf 20 .pep 
orf 20ng 
orf 20 .pep 
orf 20ng 
orf 20 .pep 
orf20ng 
orf 20 .pep 
orf 20ng 
orf 20 .pep 
orf 20ng 



MNMLGAI^KVGSLTWSRVLGFVRDTVIARAFGAGNIATDAFFVAFKLPNLLRRVFAEGAF 

I h h h 1 1 1 h 1 1 h II I II h 1 1 1 h 1 1 1 1 1 1 1 1 1 1 1 1 II 1 1 1 h 1 1 1 h 1 1 1 h 

MNMLGALAKVGSLTMVSRVLGFVRDTVIARAFGAGMATDAFFVAFKLPNLLRRVFAEGAF 



ADKFQLSIDLLRITFPYILLISLSSFVGSVLNSYHKFGIPAFTPXFLNVSFIVFALFFVP 

I I : I I I I 1 I I I [ I 1 [ I : ! I ! I I 1 I 1 I I I I I I : 1 I I : I : I I I [ I I I I 

ADKFQLSISLLRITFPYILLISLSSFVGSILNSYHKFGIPAFTPTFLNISFIVFALFFVP 



60 



60 



120 



AQAFVP I LAE YKETRS KEAXEAF I RHVAGMLS FVLVI VTALG I LAAPWV I YVS APS FAQD 

h h I h 1 1 1 1 1 1 1 II I h h 1 1 1 1 1 h 1 1 1 1 hh 1 1 II h 1 1 1 1 1 h Ih hhh 

AQAFVP I LAE YKETRS KEATEAF I RHVAGMLS FVL I WTALG I LAAPWV I YVS APGFTKD 120 



180 



180 



240 



YFDPPVTAXAWAVFVGGILQLXFQLPWLAKLGFLKLPKLSFKDAAVNRVMKQMAPAILGV 

llllll llllllllllll llllll lllllllllhllhlllh hhllll 

YFDPPVTALAWAVFVGGILQLGFQLPWI^KLGFLKLPKLNFKDAAVNRVMKQMAPAILGV 240 

S VAQVSLVINT I FAS YLQSGS VSWMYYADRMMELPSGVLGAALGT I LLPTLSKHS ANQDT 300 
h II h II I I I I I I I I I I I h h II I I I I I I I I hi I I I I II I I I I I I I I I I III I I 
SVAQISLVINTIFASYLQSGSVSWMYYADRMMELPGGVLGAALGTILLPTLSKHSANQDT 300 

EQFSALLDWGLRLCMLLTLPAAVGLAVLSFPLVATLFMYRXFTLFDAQMTQHALIAYSFG 3 60 

hi I h h I I h I I I III I I I I h I I I h I I I I II I I I II llllllll hlllllh 
EQFSALLDWGLRLCMLLTLPAAAGLAVLSFPLVATLFMYREFTLFDAQMTQHALIAYSFG 360 

LIGLIMIKVLAPGFYARQNIXXPVKIAIFTLICXQLMNLXFXGPLXXIGLSLAIGLGACI 420 

1 1 1 1 M I M 1 1 llllllll ^ 1 1 1 : 1 II 1 1 1 = 1 1 1 1 1 I III IMIIIIIIIII 

LIGLIMIKVLASGFYARQNIKTPVKIAIFTLICTQLMNLAFIGPLKHAGLSLAIGLGACI 420 

NAGLLF YLLRRHG I YQPXQGLGS VLXQKCCSRS P 4 54 

llll hhlhhhl ||||: : llllll 
NAGLLFFLFRKHGI YRPGQGLGQPSWRKCCSRSP 4 54 
An ORF20ng nucleotide sequence [<SEQ ID 119>] (SEQ ID NO: 119) was predicted to encode a 
protein having amino acid sequence [<SEQ ID 120>] (SEQ ID NO: 120): 

1 MNMLGALAKV GSLTMVSRVL GFVRDTVIAR AFGAGMATDA FFVAFKLPNL 

51 LRRVFAEGAF AQAFVPILAE YKETRSKEAT EAFIRHVAGM LSFVLIVVTA 

101 LGILAAPWVI YVSAPGFTKD ADKFQLSISL LRITFPYILL ISLSSFVGSI 

151 LNSYHKFGIP AFTPTFLNIS FIVFALFFVP YFDPPVTALA WAVFVGGILQ 

201 LGFQLPWLAK LGFLKLPKLN FKDAAVNRVM KQMAPAILGV SVAQISLVIN 

251 TIFASYLQSG SVSWMYYADR MMELPGGVLG AALGTILLPT LSKHSANQDT 

301 EQFSALLDWG LRLCMLLTLP AAAGLAVLSF PLVATLFMYR EFTLFDAQMT 

351 QHALIAYSFG LIGLIMIKVL ASGFYARQNI KTPVKIAIFT LICTQLMNLA 

401 FIGPLKHAGL SLAIGLGACI NAGLLFFLFR KHGIYRPGQG LGQPSWRKCC 

451 SRSP* 

Further DNA sequence analysis revealed the following DNA sequence [<SEQ ID 121>] (SEQ ID 
NO: 121): 

1 ATGAATATGC TTGGAGCTTT GGCAAAAGTC GGCAGCCTGA CGATGGTGTC 

51 GCGCGTTTTG GGATTTGTGC GCGATACGGT CATTGCGCGG GCATTCGGCG 

101 CGGGTATGGC GACGGATGCG TTTTTTGTCG CGTTCAAACT GCCCAACCTG 

151 CTTCGCCGCG TGTTTGCGGA GGGGGCGTTT GCCCAAGCGT TTGTGCCGAT 

201 TTTGGCGGAA TATAAGGAAA CGCGTTCTAA AGAGGCGAcg gAGGCTTTTA 

251 TCCGCCACGt tgcgggAatg CTGTCGTTTG TGCTGATcgt cGttacCGCG 

301 CTGGGCATAC TTGCCGCgcc tTGGGTGATT TATGTTtccg CgcccGGCTT 

351 TACCAAAGAC GCGGACAAGT TCCAACTTTC CATCAGCCTG CTGCGGATTA 

401 CGTTTCCTTA TATATTATTG ATTTCTTTGT CTTCTTTTGT CGGCTCGATA 

451 CTCAATTCCT ACCATAAGTT CGGCATTCCC GCGTTTACGC CCACGTTTTT 

501 AAACATCTCT TTTATCGTAT TCGCACTGTT TTTCGTGCCG TATTTCGATC 

551 CGCCCGTTAC CGCGCTGGCG TGGGCGGTTT TTGTCGGCGG TATTTTGCAG 

601 CTCGGTTTCC AACTGCCGTG GCTGGCGAAA CTGGGCTTTT TGAAACTGCC 

651 CAAACTGAAT TTCAAAGATG CGGCGGTCAA CCGCGTCATG AAACAGATGG 

701 CGCCTGCGAT TTTGGGCGTG agcgTGGCGC AAATTTCTTT GgttATCAAC 

751 ACGATTTTCG CGTCTTATCT GCAATCGGGC AGCGTTTCAT GGATGTatta 

801 cgCCGACCGC ATGATGGAGc tgcgccGGGG CGTGCTGGGG GCTGCACTCG 

851 GTACAATTTT GCTGCCGACT TTGTCCAAAC ACTCGGCAAA CCAAGATACG 

901 GAACAGTTTT CCGCCCTGCT CGACTGGGGT TTGCGCCTGT GCATGCTGCT 

951 GACGCTGCCG GCGGCGGccg GACTGGCGGT ATTGTCGTTC CCGCTGGTGG 

1001 CGACGCTGTT TATGTACCGA GAATTCACGC TGTTTGACGC ACAAATGACG 

1051 CAACACGCGC TGATTGCCTA TTCTTTCGGT TTAATCGGTT TAATTATGAT 

1101 TAAAGTGTTG GCATCCGGCT TTTATGCGCG GCAAAACATC AAAACGCCCG 

1151 TCAAAATCGC CATCTTCACG CTCATCTGCA CGCAGTTGAT GAACCTCGCC 

1201 TTTATCGGTC CGTTGAAACA CGCCGGGCTT TCGCTCGCCA TCGGCCTGGG 

1251 CGCGTGCATC AACGCCGGAT TGTTGTTCTT CCTGTTGCGC AAACACGGTA 
1301 TTTACCGGCC cggcaggggt tgggcggcgt TCTTGGCGAA AATGCTGCTC 

1351 GCGCTCGCCG TGATGTGCGG CGGACTGTGG GCGGCGCAGG CTTGCCTGCC 

1401 GTTCGAATGG GCGCACGCCG GCGGAATGCG GAAAGCGGGG CAGCTCTGCA 
1451 TCCTGATTGC CGTCGGCGGC GGACTGTATT TCGCATCTCT GGCGGCTTTG 

1501 GGCTTCCGTC CGCGCCATTT CAAACGCGTG GAAAGCTGA 

This encodes the following amino acid sequence [<SEQ ID 122; ORF20ng-1>] (SEQ ID NO: 122; 
ORF20ng-1): 



1 MNMLGALAKV GSLTMVSRVL GFVRDTVIAR AFGAGMATDA FFVAFKLPNL 
51 LRRVFAEGAF AQAFVPILAE YKETRSKEAT EAFIRHVAGM LSFVLIVVTA 
101 LGILAAPWVI YVSAPGFTKD ADKFQLSISL LRITFPYILL ISLSSFVGSI 

151 LNSYHKFGIP AFTPTFLNIS FIVFALFFVP YFDPPVTALA WAVFVGGILQ 

201 LGFQLPWLAK LGFLKLPKLN FKDAAVNRVM KQMAPAILGV SVAQISLVIN 

251 TIFASYLQSG SVSWMYYADR MMELRRGVLG AALGTILLPT LSKHSANQDT 

301 EQFSALLDWG LRLCMLLTLP AAAGLAVLSF PLVATLFMYR EFTLFDAQMT 

351 QHALIAYSFG LIGLIMIKVL ASGFYARQNI KTPVKIAIFT LICTQLMNLA 

401 FIGPLKHAGL SLAIGLGACI NAGLLFFLLR KHGIYRPGRG WAAFLAKMLL 

451 ALAVMCGGLW AAQACLPFEW AHAGGMRKAG QLCILIAVGG GLYFASLAAL 

501 GFRPRHFKRV ES* 


ORF20ng-1 (SEQ ID NO: 122) and ORF20-1 (SEQ ID NO: 116) show 95.7% identity in 512 aa 
overlap: 






10 20 30 40 50 60 

orf20-1.pep MNMLGALAKVGSLTMVSRVLGFVRDTVIARAFGAGMATDAFFVAFKLPNLLRRVFAEGAF 

1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 !i 1 1 1 1 1 U 1 1 1 1 1 1 1 1 1 1 1 1 II 1 1 1 1 1 1 1 

orf20ng-l MNMLGALAKVGSLTMVSRVLGFVRDTVIARAFGAGMATDAFFVAFKLPNLLRRVFAEGAF 

10 20 30 40 50 60 






70 80 90 100 110 120 

orf20-1.pep AQAFVPILAEYKETRSKEAAEAFIRHVAGMLSFVLVIVTALGILAAPWVIYVSAPGFAQD 

MIMIIIIIIIIIII Ihllllllllll Mil- MIIMIIIIIIIIIIIII- 

orf20ng-1 AQAFVPILAEYKETRSKEATEAFIRHVAGMLSFVLIVVTALGILAAPWVIYVSAPGFTKD 

70 80 90 100 110 120 






130 140 150 160 170 180 

orf20-1.pep ADKFQLSIDLLRITFPYILLISLSSFVGSVLNSYHKFGIPAFTPTFLNVSFIVFALFFVP 

llllllll: lllllllllllll IMMmIIIMI Mlllllllh IIMIIMI! 

orf20ng-1 ADKFQLSISLLRITFPYILLISLSSFVGSILNSYHKFGIPAFTPTFLNISFIVFALFFVP 

130 140 150 160 170 180 






190 200 210 220 230 240 

orf 20-1 .pep YFDPPVTALAWAVFVGGILQLGFQLPWLAKLGFLKLPKLSFKDAAVNRVMKQMAPAILGV 
I I I I I I I I I I I I I I I I I I I I I ! I I I I I I I I I I I I I I I : I I I I I I I I I I I I I I IM I I 
orf20ng-1 YFDPPVTALAWAVFVGGILQLGFQLPWLAKLGFLKLPKLNFKDAAVNRVMKQMAPAILGV 

190 200 210 220 230 240 






250 260 270 280 290 300 

orf 2 0 - 1 . pep SVAQVSLVINTIFASYLQSGSVSWMYYADRMMELPSGVLGAALGTILLPTLSKHSANQDT 

I hllll IIIIIMIIIIII IIIMIMI illllllllllll llllll II 
orf 20ng-l SVAQISLVINTIFASYLQSGSVSWMYYADRMMELRRGVLGAALGTILLPTLSKHSANQDT 

250 260 270 280 290 300 






310 320 330 340 350 360 

orf20-1.pep EQFSALLDWGLRLCMLLTLPAAVGLAVLSFPLVATLFMYREFTLFDAQMTQHALIAYSFG 
I I I I I I I I I I I I I II I I I I I I h I I I I I I I I I I I I I I I I I I I I I I I I I M I I I I I M I I 
orf20ng-1 EQFSALLDWGLRLCMLLTLPAAAGLAVLSFPLVATLFMYREFTLFDAQMTQHALIAYSFG 

310 320 330 340 350 360 






370 380 390 400 410 420 

orf20-1.pep LIGLIMIKVLAPGFYARQNIKTPVKIAIFTLICTQLMNLAFIGPLKHVGLSLAIGLGACI 

II I II Mill I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I h I I I I II I I I I 
orf20ng-l LIGLIMIKVLASGFYARQNIKTPVKIAIFTLICTQLMNLAFIGPLKHAGLSLAIGLGACI 

370 380 390 400 410 420 






430 440 450 460 470 480 

orf 20-1. pep NAGLLFYLLRRHGIYQPGKGWAAFLAKMLLSLAVMCGGLWAAQAYLPFEWAHAGGMRKAG 
I II I I I : I I I : I I I I : I I : I I I I I I I I I I h I I I I I I I I I I II I II I ! I I I I I I II I I I 
orf20ng-1 NAGLLFFLLRKHGIYRPGRGWAAFLAKMLLALAVMCGGLWAAQACLPFEWAHAGGMRKAG 

430 440 450 460 470 480 






490 500 510 

orf20-1.pep QLCILIAVGGGLYFASLAALGFRPRHFKRVENX 

I I I I I I I I I I I I I I I I I I I I I I I M I I I I I I : I 
orf20ng-1 QLCILIAVGGGLYFASLAALGFRPRHFKRVESX 

490 500 510 



In addition, ORF20ng-1 (SEQ ID NO: 122) shows significant homology with a virulence factor 
(SEQ ID NO: 1122) of S. typhimurium: 

sp|P37169|MVIN_SALTY VIRULENCE FACTOR MVIN pir||S40271 mviN protein - Salmonella 
typhimurium gi|438252 (Z26133) mviB gene product [Salmonella typhimurium] 
gnl|PID|d1005521 (D25292) ORF2 [Salmonella typhimurium] Length = 524 

Score = 1573 (750.1 bits), Expect = 1.1e-220, Sum P(2) = 1.1e-220 

Identities = 309/467 (66%), Positives = 368/467 (78%) 

Query: 1 MNMLGALAKVGSLTMVSRVLGFVRDTVIARAFGAGMATDAFFVAFKLPNLLRRVFAEGAF 60 

MN+L +LA V S+TM SRVLGF RD ++AR FGAGMATDAFFVAFKLPNLLRR+FAEGAF 
Sbjct: 14 MNLLKSLAAVSSMTMFSRVLGFARDAIVARIFGAGMATDAFFVAFKLPNLLRRIFAEGAF 73 

Query: 61 AQAFVPILAEYKETRSKEATEAFIRHVAGMLSFVLIVVTALGILAAPWVIYVSAPGFTKD 120 

+QAFVPILAEYK + +EAT F+ +V+G+L+ L VVT G+LAAPWVI V+APGF 
Sbjct: 74 SQAFVPILAEYKSKQGEEATRIFVAYVSGLLTLALAVVTVAGMLAAPWVIMVTAPGFADT 133 

Query: 121 ADKFQLSISLLRITFPYILLISLSSFVGSILNSYHKFGIPAFTPTFLNISFIVFALFFVP 180 

ADKF L+ LLRITFPYILLISL+S VG+ILN++++F IPAF PTFLNIS I FALF P 
Sbjct: 134 ADKFALTTQLLRITFPYILLISLASLVGAILNTWNRFSIPAFAPTFLNISMIGFALFAAP 193 

Query: 181 YFDPPVTALAWAVFVGGILQLGFQLPWLAKLGFLKLPKLNFKDAAVNRVMKQMAPAILGV 240 

YF+PPV ALAWAV VGG+LQL +QLP+L K+G L LP++NF+D RV+KQM PAILGV 
Sbjct: 194 YFNPPVLALAWAVTVGGVLQLVYQLPYLKKIGMLVLPRINFRDTGAMRVVKQMGPAILGV 253 

Query: 241 SVAQISLVINTIFASYLQSGSVSWMYYADRMMELRRGVLGAALGTILLPTLSKHSANQDT 300

SV+QISL+INTIFAS+L SGSVSWMYYADR+ME GVLG ALGTILLP+LSK A+ + 
Sbjct: 254 SVSQISLIINTIFASFLASGSVSWMYYADRLMEFPSGVLGVALGTILLPSLSKSFASGNH 313

Query: 301 EQFSALLDWGLRLCMLLTLPAAAGLAVLSFPLVATLFMYREFTLFDAQMTQHALIAYSFG 360 

+++ L+DWGLRLC LL LP+A L +L+ PL +LF Y +FT FDA MTQ ALIAYS G 
Sbjct: 314 DEYCRLMDWGLRLCFLLALPSAVALGILAKPLTVSLFQYGKFTAFDAAMTQRALIAYSVG 373

Query: 361 LIGLIMIKVLASGFYARQNIKTPVKIAIFTLICTQLMNLAFIGPLKHAGLSLAIGLGACI 420 

LIGLI++KVLA GFY+RQ+ I KTPVKI AI TLI TQLMNLAFIGPLKHAGLSL+IGL AC+ 
Sbjct: 374 LIGLIVVKVLAPGFYSRQDIKTPVKIAIVTLIMTQLMNLAFIGPLKHAGLSLSIGLAACL 433 

Query: 421 NAGLLFFLLRKHGIYRPGRGWXXXXXXXXXXXXVMCGGLWAAQACLP 467 

NA LL++ LRK 1+ P GW VM L+ +P 

Sbjct: 434 NAS LL YWQLRKQN I FT PQ PGWMWFLMRL 1 1 S VLVMAAVL FG VLH IMP 480 

Score = 70 (33.4 bits), Expect = 1.1e-220, Sum P(2) = 1.1e-220
Identities = 14/41 (34%), Positives = 23/41 (56%) 



Query: 469 EWAHAGGMRKAGQLCILIAVGGGLYFASLAALGFRPRHFKR 509

EW+ + + +L ++ G YFA+LA LGF+ + F R 
Sbjct: 481 EWSQGSMLWRLLRLMAWIAGIAAYFAALAVLGFKVKEFVR 521 
Based on this analysis, including the homology with a virulence factor (SEQ ID NO: 1122) from
S. typhimurium, it is predicted that these proteins from N. meningitidis and N. gonorrhoeae, and their
epitopes, could be useful antigens for vaccines or diagnostics, or for raising antibodies.

Example 15 

The following partial DNA sequence was identified in N. meningitidis [<SEQ ID 123>] (SEQ ID
NO: 123):

1 atGATTAAAA TCAAAAAAGG TCTAAACCTG CCCATCGCGG GCAGACCGGA 

51 GCAAGCCGTT tACGACGGCC CGGCCaTTAC CGAAGtCGCG TTGCTTGGCG 

101 AAGAATATGC CGGTATGCGC CCCTCGATGA AAGTCAAGGA AGGCGATGCC 

151 GTcAAAAAAG GCCAAGTGCT GTTTGAAGAC AAAAAGAATC CGGGCGTGGT 

201 GTTTACTGCG CCGGCTTCAG GcAAAATCGC CGCGATTCAC CGTGGCGAAA
251 AGCGCGTACT TCAGTCAGTC GTGATTGCCG TTGAArGCAA CGACGAAATC 
301 GAGTTTGAAC GCTACGCACC TGAAGCGCTG GCAAACTTAA GCGGCGAAGA 

351 AGTGCGCCGC AACCTGATCC AATCCGGTTT GTGGACTGCG CTGCGCACCC

401 GTCCGTTCAG CAAAATTCCT GCCGTCGATG CCGAGCCGTT CGCCATCTTC
451 GTCAATGCGA tGGACACCAA TCCG..

This corresponds to the amino acid sequence [<SEQ ID 124; ORF22>] (SEQ ID NO: 124;
ORF22):

1 MIKIKKGLNL PIAGRPEQAV YDGPAITEVA LLGEEYAGMR PSMKVKEGDA 
51 VKKGQVLFED KKNPGVVFTA PASGKIAAIH RGEKRVLQSV VIAVEXNDEI
101 EFERYAPEAL ANLSGEEVRR NLIQSGLWTA LRTRPFSKIP AVDAEPFAIF
151 VNAMDTNP..
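
The correspondence between the partial DNA sequence and the amino acid sequence above can be checked with a short translation routine. The following is a minimal sketch, assuming the standard genetic code and a reading frame starting at position 1; the function name `translate` and the choice to report any codon containing an ambiguous base as `X` are illustrative assumptions, not part of the specification.

```python
from itertools import product

BASES = "TCAG"
# Standard genetic code listed in TCAG codon order.
AMINO = ("FFLLSSSSYY**CC*W"
         "LLLLPPPPHHQQRRRR"
         "IIIMTTTTNNKKSSRR"
         "VVVVAAAADDEEGGGG")
CODON_TABLE = {"".join(c): a for c, a in zip(product(BASES, repeat=3), AMINO)}


def translate(dna: str) -> str:
    """Translate DNA in frame 1; unknown or ambiguous codons become 'X'."""
    # Drop whitespace so sequences printed in blocks of ten can be pasted in.
    seq = "".join(dna.split()).upper()
    protein = []
    # Step through complete codons only; a trailing partial codon is ignored.
    for i in range(0, len(seq) - 2, 3):
        protein.append(CODON_TABLE.get(seq[i:i + 3], "X"))
    return "".join(protein)


if __name__ == "__main__":
    # First 30 bases of the partial sequence shown above.
    print(translate("atGATTAAAA TCAAAAAAGG TCTAAACCTG"))
```

Under these assumptions, a codon containing an ambiguity code such as the `r` at position 286 translates to `X`, which is consistent with the `X` shown in the fifth block of the second row of ORF22.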

Further work revealed the complete nucleotide sequence [<SEQ ID 125>] (SEQ ID NO: 125):



1 ATGATTAAAA TCAAAAAAGG TCTAAACCTG CCCATCGCGG GCAGACCGGA 

51 GCAAGCCGTT TACGACGGCC CGGCCATTAC CGAAGTCGCG TTGCTTGGCG 

101 AAGAATATGC CGGTATGCGC CCCTCGATGA AAGTCAAGGA AGGCGATGCC 

151 GTCAAAAAAG GCCAAGTGCT GTTTGAAGAC AAAAAGAATC CGGGCGTGGT 

201 GTTTACTGCG CCGGCTTCAG GCAAAATCGC CGCGATTCAC CGTGGCGAAA
251 AGCGCGTACT TCAGTCAGTC GTGATTGCCG TTGAAGGCAA CGACGAAATC 
301 GAGTTTGAAC GCTACGCACC TGAAGCGCTG GCAAACTTAA GCGGCGAAGA 

351 AGTGCGCCGC AACCTGATCC AATCCGGTTT GTGGACTGCG CTGCGCACCC

401 GTCCGTTCAG CAAAATTCCT GCCGTCGATG CCGAGCCGTT CGCCATCTTC
451 GTCAATGCGA TGGACACCAA TCCGCTGGCT GCCGACCCTA CGGTCATTAT
501 CAAAGAAGCC GCCGAGGATT TCAAACGCGG CCTGTTGGTA TTGAGCCGTT 
551 TGACCGAACG CAAAATCCAT GTTTGTAAGG CAGCTGGCGC AGACGTGCCG 
601 TCTGAAAATG CTGCCAACAT CGAAACACAT GAATTCGGCG GCCCGCATCC 
651 TGCCGGTTTG AGTGGCACGC ACATTCATTT CATCGAGCCG GTCGGCGCGA 
701 ATAAAACCGT GTGGACCATC AATTATCAAG ATGTAATTAC CATTGGCCGT 
751 TTGTTTGCAA CAGGCCGTCT GAACACCGAG CGCGTGATTG CCCTAGGTGG 
801 TTCTCAAGTC AACAAACCGC GCCTCTTGCG TACCGTTTTG GGTGCGAAAG 
851 TATCGCAAAT TACTGCGGGC GAATTGGTTG ACACAGACAA CCGCGTGATT 
901 TCCGGTTCGG TATTGAACGG CGCGATTACA CAAGGCGCGC ACGATTATTT 
951 GGGACGCTAC CACAATCAGA TTTCCGTTAT CGAAGAAGGC CGCAGCAAAG 

1001 AGCTGTTCGG CTGGGTTGCG CCGCAGCCGG ACAAATACTC CATCACGCGT 
1051 ACAACCCTCG GCCATTTCCT GAAAAACAAA CTCTTCAAGT TCAACACAGC

1101 CGTCAACGGC GGCGACCGCG CCATGGTGCC GATTGGTACT TACGAGCGCG 

1151 TGATGCCCTT GGATATCCTG CCCACCCTGC TTTTGCGCGA TTTAATCGTC 

1201 GGCGATACCG ACAGCGCGCA GGCATTGGGT TGCTTGGAAT TGGACGAAGA 

1251 AGACCTCGCT TTGTGCAGCT TCGTCTGCCC GGGCAAATAC GAATACGGCC 

1301 CGCTGTTGCG CAAAGTGCTG GAAACCATTG AGAAGGAAGG CTGA 
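
A reader may wish to confirm that the partial sequence (SEQ ID NO: 123) agrees with the start of the complete sequence (SEQ ID NO: 125), allowing for the lower-case and ambiguous bases in the partial read. The following is a minimal sketch under stated assumptions: ambiguity codes are interpreted per the IUPAC nucleotide convention, position numbers, dots, and spaces are ignored, and the helper names `clean` and `first_conflict` are hypothetical.

```python
# IUPAC nucleotide codes mapped to the bases they can stand for.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT",
    "K": "GT", "M": "AC", "B": "CGT", "D": "AGT",
    "H": "ACT", "V": "ACG", "N": "ACGT",
}


def clean(seq: str) -> str:
    """Keep only nucleotide letters; drops digits, dots and whitespace."""
    return "".join(ch for ch in seq.upper() if ch in IUPAC)


def first_conflict(partial: str, complete: str):
    """Return the 0-based index of the first incompatible base, or None.

    The partial read is compared against the start of the complete sequence.
    Two codes are compatible when they share at least one possible base.
    """
    p, c = clean(partial), clean(complete)
    if len(p) > len(c):
        return len(c)  # partial read runs past the end of the complete one
    for i, (x, y) in enumerate(zip(p, c)):
        if not set(IUPAC[x]) & set(IUPAC[y]):
            return i
    return None


if __name__ == "__main__":
    # 'r' (A or G) in the partial read is compatible with G in the complete one.
    print(first_conflict("GTTGAArGCAA", "GTTGAAGGCAACGAC"))
```

A return value of `None` means no position in the partial read contradicts the complete sequence; it does not by itself establish that the two reads come from the same clone.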

This corresponds to the amino acid sequence [<SEQ ID 126; ORF22-1>] (SEQ ID NO: 126;
ORF22-1):



1 MIKIKKGLNL PIAGRPEQAV YDGPAITEVA LLGEEYAGMR PSMKVKEGDA 

51 VKKGQVLFED KKNPGVVFTA PASGKIAAIH RGEKRVLQSV VIAVEGNDEI

101 EFERYAPEAL ANLSGEEVRR NLIQSGLWTA LRTRPFSKIP AVDAEPFAIF 

151 VNAMDTNPLA ADPTVIIKEA AEDFKRGLLV LSRLTERKIH VCKAAGADVP

201 SENAANIETH EFGGPHPAGL SGTHIHFIEP VGANKTVWTI NYQDVITIGR 

251 LFATGRLNTE RVIALGGSQV NKPRLLRTVL GAKVSQITAG ELVDTDNRVI 

301 SGSVLNGAIT QGAHDYLGRY HNQISVIEEG RSKELFGWVA PQPDKYSITR 

351 TTLGHFLKNK LFKFNTAVNG GDRAMVPIGT YERVMPLDIL PTLLLRDLIV 

401 GDTDSAQALG CLELDEEDLA LCSFVCPGKY EYGPLLRKVL ETIEKEG*

Further work identified the corresponding gene in strain A of N. meningitidis [<SEQ ID 127>]
(SEQ ID NO: 127):



1 ATGATTAAAA TCAAAAAAGG TCTAAACCTG CCCATCGCGG GCAGACCGGA 

51 GCAAGTCATT TATGACGGGC CCGTCATTAC CGAAGTCGCG TTGCTTGGCG 

101 AAGAATATGC CGGTATGCGC CCCTNGATGA AAGTCAAGGA AGGCGATGCC 

151 GTCAAAAAAG GCCAAGTGCT GTTTGAAGAC AAAAAGNATC CGGGCGTGGT 

201 GTTTACCGCG CCNGTTTCAG GCAAAATCGC CGCCATCCAT CGCGGCGAAA
251 AGCGCGTACT TCAGTCGGTC GTGATTGCCG TTGAAGGCAA CGACGAAATC 

301 GAGTTCGAAC GCTACGCGCC CGAAGCGTTG GCAAACTTAA GCGGCGANGA
351 ANTNNGNNGC AATCTGATCC AATCCGGTTT GTGGACTGCG CTGCGTANCC 

401 GTCCGTTCAG CAAAATCCCT GCCGTCGATG CCGAGCCGTT CGCCATCTTC
451 GTCAATGCGA TGGACACCAA TCCGCTNGCG GCAGACCCTG TGGTTGTGAT
501 CAAAGAAGCC GNCGANGATT TCAGACGANG TNTGCTGGTA TTGAGCCGTT 
551 TGACCGAGCG TAAAATCCAT GTGTGTAAGG CAGCTGGCGC AGACGTGCCG 
601 TCTGAAAATG CTGCCAACAT CGAAACACAT GAATTCGGCG GCCCGCATCC 
651 GGCCGGTTTG AGTGGCACGC ACATTCATTT CATTGAGCCG GTCGGTGCAA 
701 ACAAAACCGT TTGGACCATC AATTATCAAG ATGTAATTGC CATCGGACGT 
751 TTGTTTGCAA CAGGCCGTCT GAACACCGAG CGCGTGATTG CTTTGGGTGG 
801 TTCTCAAGTC AACAAACCAC GCCTCTTGCG TACCGTTTTG GGTGCGAAAG 
851 TATCGCAAAT TACTGCGGGC GAATTGGTTG ACGCAGACAA CCGCGTGATT 
901 TCCGGTTCGG TATTGAACGG CGCGATTACA CAAGGCGCGC ACGATTATTT 
951 GGGACGCTAC CACAATCAGA TTTCCGTTAT CGAAGAAGGC CGCAGCAAAG 

1001 AGCTGTTCGG CTGGGTTGCG CCGCAGCCGG ACAAATACTC CATCACGCGT 

1051 ACGACCCTCG GCCATTTCCT GAAAAACAAA CTCTTCAAGT TCACGACAGC 

1101 CGTCAACGGT GGCGACCGCG CCATGGTGCC GATTGGTACT TACGAGCGCG 

1151 TAATGCCGCT AGACATCCTG CCTACCCTGC TTTTGCGCGA TTTAATCGTC 

1201 GGCGATACCG ACAGCGCGCA AGCATTGGGT TGCTTGGAAT TGGACGAAGA
1251 AGACCTCGCT TTGTGCAGCT TCGTCTGCCC GGGCAAATAC GAATANGGCC 

1301 CGCTGTTGCG TAAGGTGCTG GAAACCNTTG AGAAGGAAGG CTGA



This encodes a protein having amino acid sequence [<SEQ ID 128; ORF22a>] (SEQ ID NO: 128;
ORF22a):

1 MIKIKKGLNL PIAGRPEQVI YDGPVITEVA LLGEEYAGMR PXMKVKEGDA
51 VKKGQVLFED KKXPGVVFTA PVSGKIAAIH RGEKRVLQSV VIAVEGNDEI
101 EFERYAPEAL ANLSGXEXXX NLIQSGLWTA LRXRPFSKIP AVDAEPFAIF
151 VNAMDTNPLA ADPVVVIKEA XXDFRRXXLV LSRLTERKIH VCKAAGADVP
201 SENAANIETH EFGGPHPAGL SGTHIHFIEP VGANKTVWTI NYQDVIAIGR
251 LFATGRLNTE RVIALGGSQV NKPRLLRTVL GAKVSQITAG ELVDADNRVI
301 SGSVLNGAIT QGAHDYLGRY HNQISVIEEG RSKELFGWVA PQPDKYSITR
351 TTLGHFLKNK LFKFTTAVNG GDRAMVPIGT YERVMPLDIL PTLLLRDLIV
401 GDTDSAQALG CLELDEEDLA LCSFVCPGKY EXGPLLRKVL ETXEKEG*



The originally-identified partial strain B sequence (ORF22) (SEQ ID NO: 124) shows 94.2%
identity over a 158aa overlap with ORF22a (SEQ ID NO: 128):






10 20 30 40 50 60 

orf22.pep MIKIKKGLNLPIAGRPEQAVYDGPAITEVALLGEEYAGMRPSMKVKEGDAVKKGQVLFED
MII!IIIIIIIIIMI|::Mi|: I MIIIIIIMIM I I I I I I I I I I I I I I I I II 
orf22a    MIKIKKGLNLPIAGRPEQVIYDGPVITEVALLGEEYAGMRPXMKVKEGDAVKKGQVLFED

10 20 30 40 50 60 






70 80 90 100 110 120 

orf22.pep KKNPGVVFTAPASGKIAAIHRGEKRVLQSVVIAVEXNDEIEFERYAPEALANLSGEEVRR

II 1 . 1 1 1 1 1 1 : 1 1 1 1 1 1 M I M M 1 1 1 1 1 1 i M II 1 1 1 1 1 1 1 1 1 1 M M 1 1 1 I 

orf22a    KKXPGVVFTAPVSGKIAAIHRGEKRVLQSVVIAVEGNDEIEFERYAPEALANLSGXEXXX

70 80 90 100 110 120 



130 140 150 

orf22.pep NLIQSGLWTALRTRPFSKIPAVDAEPFAIFVNAMDTNP

25 | | | | | | | | | | | | : | | | | | || | | | | | | | | | | | | | | | | | | 

orf22a    NLIQSGLWTALRXRPFSKIPAVDAEPFAIFVNAMDTNPLAADPVVVIKEAXXDFRRXXLV

130 140 150 160 170 180 
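
The identity figures quoted for these comparisons are counts of identical aligned residues divided by the length of the overlap. The following is a minimal sketch of that arithmetic, assuming two already-aligned sequences of residues (gaps written as `-`); treating the unknown residue `X` as a mismatch is an assumption made here for illustration, and the alignment program used for the specification may score such positions differently.

```python
def percent_identity(a: str, b: str, gap: str = "-"):
    """Return (matches, overlap, percent) for two aligned sequences.

    The overlap is taken as the length of the shorter sequence, so a
    partial ORF can be compared against the start of a complete one.
    """
    a = "".join(a.split()).upper()
    b = "".join(b.split()).upper()
    overlap = min(len(a), len(b))
    if overlap == 0:
        raise ValueError("sequences must not be empty")
    # Count positions with the same residue; gaps and unknown residues
    # (X) are never counted as matches in this sketch.
    matches = sum(
        1 for x, y in zip(a, b) if x == y and x != gap and x != "X"
    )
    return matches, overlap, 100.0 * matches / overlap


if __name__ == "__main__":
    # First ten residues of ORF22 against the first ten of the 48kDa protein.
    print(percent_identity("MIKIKKGLNL", "MITIKKGLDL"))
```

Because the function expects aligned input, sequences that require internal gaps must first be aligned by a separate tool.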

The complete strain B sequence (ORF22-1) (SEQ ID NO: 126) and ORF22a (SEQ ID NO: 128)
show 94.9% identity in 447 aa overlap:






10 20 30 40 50 60 

orf22a.pep MIKIKKGLNLPIAGRPEQVIYDGPVITEVALLGEEYAGMRPXMKVKEGDAVKKGQVLFED

' I I I I I II II I II II I I I -I I I I U I I I I I I I I I I I I I I I ' I I I I I I I I I I I I I I 
orf22-1    MIKIKKGLNLPIAGRPEQAVYDGPAITEVALLGEEYAGMRPSMKVKEGDAVKKGQVLFED

10 20 30 40 50 60 






70 80 90 100 110 120 

orf22a.pep KKXPGVVFTAPVSGKIAAIHRGEKRVLQSVVIAVEGNDEIEFERYAPEALANLSGXEXXX

II 1 1 1 1 1 1 1 M 1 1 1 1 1 1 1 , 1 1 1 1 1 1 1 1 1 1 II 1 1 1 M 1 1 M I I M 1 1 1 1 1 M I 

orf22-1    KKNPGVVFTAPASGKIAAIHRGEKRVLQSVVIAVEGNDEIEFERYAPEALANLSGEEVRR

70 80 90 100 110 120 






130 140 150 160 170 180 

orf22a.pep NLIQSGLWTALRXRPFSKIPAVDAEPFAIFVNAMDTNPLAADPVVVIKEAXXDFRRXXLV

1 1 I 1 1 1 1 1 1 1 1 hi I M I M I 1 1 1 1 I M II I 1 1 1 1 1 1 1 1 I M-'-l 1 1 1 Ihl II 

orf 22-1 NLIQSGLWTALRTRPFSKIPAVDAEPFAIFVNAMDTNPLAADPTVIIKEAAEDFKRGLLV 

130 140 150 160 170 180 



190 200 210 220 230 240 

orf 22a. pep LSRLTERKIHVCKAAGADVPSENAANIETHEFGGPHPAGLSGTHIHFIEPVGANKTVWTI 

1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 , 1 1 1 1 1 1 1 1 1 1 1 1 II 1 1 1 1 1 1 1 1 1 1 1 1 M M 1 1 1 1 1 II 1 1 

orf 22 - 1 LSRLTERKIHVCKAAGADVPSENAANIETHEFGGPHPAGLSGTHIHFIEPVGANKTVWTI 



190 200 210 220 230 240



250 260 270 280 290 300 

orf 22a . pep NYQDVIAIGRLFATGRLNTERVIALGGSQVNKPRLLRTVLGAKVSQITAGELVDADNRVI 

M I i I h 1 1 1 1 i 1 1 M 1 1 1 1 1 1 1 1 1 1 M 1 1 1 1 i I M 1 1 1 1 1 II 1 1 1 1 1 1 M I h 1 1 1 1 

orf22-1    NYQDVITIGRLFATGRLNTERVIALGGSQVNKPRLLRTVLGAKVSQITAGELVDTDNRVI

250 260 270 280 290 300

310 320 330 340 350 360 

orf 22a . pep SGSVLNGAITQGAHDYLGRYHNQISVIEEGRSKELFGWVAPQPDKYSITRTTLGHFLKNK 
I I I I I II I I I II M I I I I I II I I I I I I I I I I I I II I I I I I I I I I I I I I M I I I I II I I 
orf22-1    SGSVLNGAITQGAHDYLGRYHNQISVIEEGRSKELFGWVAPQPDKYSITRTTLGHFLKNK

310 320 330 340 350 360 

370 380 390 400 410 420 

orf22a.pep LFKFTTAVNGGDRAMVPIGTYERVMPLDILPTLLLRDLIVGDTDSAQALGCLELDEEDLA

I I I :| II I II I II I I I II I I I I I I I I I I I M I I I I I I II M II I I I I I I I I I I I II 
orf22-1    LFKFNTAVNGGDRAMVPIGTYERVMPLDILPTLLLRDLIVGDTDSAQALGCLELDEEDLA

370 380 390 400 410 420 

430 440 
orf 22a. pep LCSFVCPGKYEXGPLLRKVLETXEKEGX 

IMIIIIIIII llllllllll Mill 
orf 22 - 1 LCSFVCPGKYEYGPLLRKVLETIEKEGX 

430 440 

Further work identified a partial gene sequence [<SEQ ID 129>] (SEQ ID NO: 129) from
N. gonorrhoeae, which encodes the following amino acid sequence [<SEQ ID 130; ORF22ng>]
(SEQ ID NO: 130; ORF22ng):



1 MIKIKKGLNL PIAGRPEQVI YDGPAITEVA LLGEEYVGMR PSMKIKEGEA
51 VKKGQVLFED KKNPGVVFTA PASGKIAAIH RGEKRVLQSV VIAVEGNDEI
101 EFERYVPEAL AKLSSEKVRR NLIQSGLWTA LRTRPFSKIP AVDAEPFAIF
151 VNAMDTNPLA ADPTVIIKEA AEDFKRGLLV LSRLTERKIH VCKAAGADVP
201 SENAANIETH EFGGPHPAGL SGTHIHFIEP VGANKTVWTI NYQDVIAIGR
251 LFVTGRLNTE RVVALGGLQV NKPRLLRTVL GAKVSQLTAG ELVDADNRVI
301 SGSVLNGAIA QGAHDYLGRY HN*



Further work identified the complete gonococcal gene [<SEQ ID 131>] (SEQ ID NO: 131):

1 ATGATTAAAA TCAAAAAAGG TCTAAATCTG CCCATCGCGG GCAGACCGGA
51 GCAAGTCATT TATGACGGCC CGGCCATTAC CGAAGTCGCG TTGCTTGGCG
101 AAGAATATGT CGGCATGCGC CCCTCGATGA AAATCAAGGA AGGTGAAGCC
151 GTCAAAAAAG GCCAAGTGCT GTTTGAAGAC AAAAAGAATC CGGGCGTAGT
201 ATTTACTGCG CCGGCTTCAG GCAAAATCGC CGCTATTCAC CGTGGCGAAA
251 AGCGCGTACT TCAGTCAGTC GTGATTGCCG TTGAAGGCAA CGACGAAATC
301 GAGTTCGAAC GCTACGTACC TGAAGCGCTG GCAAAATTGA GCAGCGAAAA
351 AGTGCGCCGC AACCTGATTC AATCAGGCTT ATGGACTGCG CTTCGCACCC
401 GTCCGTTCAG CAAAATCCCT GCCGTAGATG CCGAGCCGTT CGCCATCTTC
451 GTCAATGCGA TGGACACCAA TCCGCTGGCT GCCGACCCTA CGGTCATCAT
501 CAAAGAAGCC GCCGAAGACT TCAAACGCGG CCTGTTGGTA TTGAGCCGCC
551 TGACCGAACG TAAAATCCAT GTGTGTAAAG CAGCAGGCGC AGACGTGCCG
601 TCTGAAAATG CTGCCAATAT CGAAACACAT GAATTTGGCG GCCCGCATCC
651 TGCCGGCTTG AGTGGCACGC ACATTCATTT CATCGAGCCA GTCGGCGCGA
701 ATAAAACCGT GTGGACCATC AATTATCAAG ACGTGATTGC TATCGGACGT
751 TTGTTCGTAA CAGGCCGTCT GAATACCGAG CGCGTGGTTG CCTTGGGCGG
801 CCTGCAAGTC AACAAACCGC GCCTCTTGCG TACCGTTTTG GGTGCGAAGG
851 TGTCTCAACT TACCGCCGGC GAATTGGTTG ACGCGGALAA
901 TCCGGTTCGG TATTGAACGG TGCGATTGCA CAAGGCGCGC ATGATTATTT
951 GGGACGCTAC CACAATCAGA TTTCCGTTAT CGAAGAAGGC CGCAGCAAAG
1001 AGCTGTTCGG CTGGGTTGCG CCGCAGCCGG ACAAATACTC CATCACGCGC
1051 ACCACTCTCG GCCATTTCCT AAAAAACAAA CTCTTCAAGT TCACGACAGC
1101 CGTCAACGGC GGCGACCGCG CCATGGTACC GATCGGCACT TATGAGCGCG
1151 TAATGCCGTT GGACATCCTG CCTACCTTGC TTTTGCGCGA TTTAATCGTC
1201 GGCGATACCG ACAGCGCGCA GGCTTTGGGT TGCTTGGAAT TGGACGAAGA
1251 AGACCTCGCT TTGTGCAGCT TCGTCTGCCC GGGCAAATAC GAATACGGCC
1301 CGCTGTTGCG CAAAGTGCTG GAAACCATTG AGAAGGAAGG CTGA


This encodes a protein having amino acid sequence [<SEQ ID 132; ORF22ng-1>] (SEQ ID NO:
132; ORF22ng-1):

1 MIKIKKGLNL PIAGRPEQVI YDGPAITEVA LLGEEYVGMR PSMKIKEGEA
51 VKKGQVLFED KKNPGVVFTA PASGKIAAIH RGEKRVLQSV VIAVEGNDEI
101 EFERYVPEAL AKLSSEKVRR NLIQSGLWTA LRTRPFSKIP AVDAEPFAIF
151 VNAMDTNPLA ADPTVIIKEA AEDFKRGLLV LSRLTERKIH VCKAAGADVP
201 SENAANIETH EFGGPHPAGL SGTHIHFIEP VGANKTVWTI NYQDVIAIGR
251 LFVTGRLNTE RVVALGGLQV NKPRLLRTVL GAKVSQLTAG ELVDADNRVI
301 SGSVLNGAIA QGAHDYLGRY HNQISVIEEG RSKELFGWVA PQPDKYSITR
351 TTLGHFLKNK LFKFTTAVNG GDRAMVPIGT YERVMPLDIL PTLLLRDLIV
401 GDTDSAQALG CLELDEEDLA LCSFVCPGKY EYGPLLRKVL ETIEKEG*



The originally-identified partial strain B sequence (ORF22) (SEQ ID NO: 124) shows 93.7%
identity over a 158aa overlap with ORF22ng (SEQ ID NO: 130):

orf22.pep MIKIKKGLNLPIAGRPEQAVYDGPAITEVALLGEEYAGMRPSMKVKEGDAVKKGQVLFED 60

MM MM llllllll MM lllllll IIIIIIIIMIIIIIIMIMII Illlllll 

orf22ng   MIKIKKGLNLPIAGRPEQVIYDGPAITEVALLGEEYVGMRPSMKIKEGEAVKKGQVLFED 60

orf22.pep KKNPGVVFTAPASGKIAAIHRGEKRVLQSVVIAVEXNDEIEFERYAPEALANLSGEEVRR 120

I I MUM Mill I II MM III 1 1 II M 1 1 III llllllllhllllhlhhIM 

orf22ng   KKNPGVVFTAPASGKIAAIHRGEKRVLQSVVIAVEGNDEIEFERYVPEALAKLSSEKVRR 120

orf22.pep NLIQSGLWTALRTRPFSKIPAVDAEPFAIFVNAMDTNP 158

I I I M 1 1 1 1 1 1 1 1 1 1 1 1 1 1 II II ! 1 1 1 1 1 1 1 M II 

orf22ng   NLIQSGLWTALRTRPFSKIPAVDAEPFAIFVNAMDTNPLAADPTVIIKEAAEDFKRGLLV 180

The complete sequences from strain B (ORF22-1) (SEQ ID NO: 126) and gonococcus
[(ORF22ng)] (ORF22ng-1) (SEQ ID NO: 132) show 96.2% identity in 447 aa overlap:

10 20 30 40 50 60 

orf22-1.pep MIKIKKGLNLPIAGRPEQAVYDGPAITEVALLGEEYAGMRPSMKVKEGDAVKKGQVLFED
I I I I I I I II I I I I I I I I I I I I I I I I I I I I I h I I I I I hi I M II II I II I I I 
orf22ng-1   MIKIKKGLNLPIAGRPEQVIYDGPAITEVALLGEEYVGMRPSMKIKEGEAVKKGQVLFED

10 20 30 40 50 60

70 80 90 100 110 120 

orf22-1.pep KKNPGVVFTAPASGKIAAIHRGEKRVLQSVVIAVEGNDEIEFERYAPEALANLSGEEVRR

lllllllllllllll M llllllllllllllllllllll Ihllllhll-'hll 

orf22ng-1   KKNPGVVFTAPASGKIAAIHRGEKRVLQSVVIAVEGNDEIEFERYVPEALAKLSSEKVRR

70 80 90 100 110 120



130 140 150 160 170 180 

orf22-1.pep NLIQSGLWTALRTRPFSKIPAVDAEPFAIFVNAMDTNPLAADPTVIIKEAAEDFKRGLLV

1 1 1 M I 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 ! I M 1 1 1 1 1 1 1 M I ! II 1 1 1 1 M 1 1 1 1 1 1 II 

orf22ng-1   NLIQSGLWTALRTRPFSKIPAVDAEPFAIFVNAMDTNPLAADPTVIIKEAAEDFKRGLLV

130 140 150 160 170 180 









190 200 210 220 230 240 

orf 22 - 1 . pep LSRLTERKIHVCKAAGADVPSENAANIETHEFGGPHPAGLSGTHIHFIEPVGANKTVWTI 

IIIIIIIIIMIIIII IIIIIIIMIIIIIIII lillllll llllllll IIMM 

orf22ng-l LSRLTERKIHVCKAAGADVPSENAANIETHEFGGPHPAGLSGTHIHFIEPVGANKTVWTI 

190 200 210 220 230 240 

250 260 270 280 290 300 

orf 22 - 1 . pep NYQDVITIGRLFATGRLNTERVIALGGSQVNKPRLLRTVLGAKVSQITAGELVDTDNRVI 

Illllhllllhllllllllhllll lllllllllllllllllhllllllhlllll 
orf22ng-1   NYQDVIAIGRLFVTGRLNTERVVALGGLQVNKPRLLRTVLGAKVSQLTAGELVDADNRVI

250 260 270 280 290 300 






310 320 330 340 350 360 

orf22-1.pep SGSVLNGAITQGAHDYLGRYHNQISVIEEGRSKELFGWVAPQPDKYSITRTTLGHFLKNK
M I I III I h I I I I II I I I I I I I M I I I I I I M I I II I I I II , I I I I I II I .1 I I I I I I 
orf22ng-1   SGSVLNGAIAQGAHDYLGRYHNQISVIEEGRSKELFGWVAPQPDKYSITRTTLGHFLKNK

310 320 330 340 350 360 






370 380 390 400 410 420 

orf 22 - 1 . pep LFKFNTAVNGGDRAMVPIGTYERVMPLDILPTLLLRDLIVGDTDSAQALGCLELDEEDLA 

1 1 1 i : 1 1 1 1 1 1 M 1 1 1 1 1 1 ! L 1 1 1 1 1 1 1 1 1 i 1 1 1 1 1 S 1 1 1 1 1 1 1 ! 1 1 ! 1 1 1 1 1 1 M 1 1 1 1 

orf22ng-1   LFKFTTAVNGGDRAMVPIGTYERVMPLDILPTLLLRDLIVGDTDSAQALGCLELDEEDLA

370 380 390 400 410 420 



430 440 
orf 22 - 1 . pep LCSFVCPGKYEYGPLLRKVLETIEKEGX 
! I I I I I I I I I M I I I II I I I II I I I I 
orf22ng-1   LCSFVCPGKYEYGPLLRKVLETIEKEGX

430 440 

Computer analysis of these sequences gave the following results: 






Homology with 48kDa outer membrane protein of Actinobacillus pleuropneumoniae (accession
number U24492) (SEQ ID NO: 1123)



ORF22 (SEQ ID NO: 124) and this 48kDa protein (SEQ ID NO: 1123) show 72% aa identity in
158aa overlap:



orf22 1 MIKIKKGLNLPIAGRPEQAVYDGPAITEVALLGEEYAGMRPSMKVKEGDAVKKGQVLFED 60
MI IKKGL+LPIAG P Q +++G + EVA+LGEEY GMRPSMKV+EGD VKKGQVLFED
48kDa 1 MITIKKGLDLPIAGTPAQVIHNGNTVNEVAMLGEEYVGMRPSMKVREGDVVKKGQVLFED 60

orf22 61 KKNPGVVFTAPASGKIAAIHRGEKRVLQSVVIAVEXNDEIEFERYAPEALANLSGEEVRR 120
KKNPGVVFTAPASG + I +RGEKRVLQSVVI VE + + + I F RY LA+LS E+V+ +
48kDa 61 KKNPGVVFTAPASGTVVTINRGEKRVLQSVVIKVEGDEQITFTRYEAAQLASLSAEQVKQ 120

orf22 121 NLIQSGLWTALRTRPFSKIPAVDAEPFAIFVNAMDTNP 158
NLI+SGLWTA RTRPFSK+PA+DA P +IFVNAMDTNP
48kDa 121 NLIESGLWTAFRTRPFSKVPALDAIPSSIFVNAMDTNP 158

ORF22a (SEQ ID NO: 128) also shows homology to the 48kDa Actinobacillus pleuropneumoniae
protein (SEQ ID NO: 1123):

gi|1185395 (U24492) 48 kDa outer membrane protein [Actinobacillus pleuropneumoniae]
Length = 449

Score = 530 bits (1351) , Expect = e-150 

Identities = 274/450 (60%), Positives = 323/450 (70%), Gaps = 4/450 (0%) 



Query: 1 MIKIKKGLNLPIAGRPEQVIYDGPVITEVALLGEEYAGMRPXMKVKEGDAVKKGQVLFED 60

MI IKKGL+LPIAG P QVI++G + EVA+LGEEY GMRP MKV+EGD VKKGQVLFED 
Sbjct: 1 MITIKKGLDLPIAGTPAQVIHNGNTVNEVAMLGEEYVGMRPSMKVREGDVVKKGQVLFED 60

Query: 61 KKXPGVVFTAPVSGKIAAIHRGEKRVLQSVVIAVEGNDEIEFERYAPEALANLSGXEXXX 120
KK PGWFTAP SG + I +RGEKRVLQSWI VEG+++I F RY LA+LS + 
Sbjct: 61 KKNPGVVFTAPASGTVVTINRGEKRVLQSVVIKVEGDEQITFTRYEAAQLASLSAEQVKQ 120

Query: 121 NLIQSGLWTALRXRPFSKIPAVDAEPFAIFVNAMDTNPLAADPVWIKEAXXDFRRXXLV 180 

NLI+SGLWTA R RPFSK+PA+DA P +IFVNAMDTNPLAADP W+KE DF+ V 
Sbjct: 121 NL I ESGLWTAFRTRP FS KVP ALDA I PS S I FVNAMDTNPLAADPEVVLKEYETDFKDGLTV 180 

Query: 181 LSRL- -TERKIHVCKAAGADVP-SENAANIETHEFGGPHPAGLSGTHIHFIEPVGANKTV 237 
20 L+RL ++ + ++CK A + ++P S I F G HPAGL GTH I HF + + P VGA K V 

Sbjct: 181 LTRLFNGQKPVYLCKDADSNI PLSPAIEGITI KSFSGVHPAGLVGTHIHFVDPVGATKQV 240 

Query: 23 8 WTINYQDVIAIGRLFATGRLNTERVIALGGSQVNKPRLLRTVLGAKVSQITAGELVDADN 2 97 

W +NYQDVIAIG+LF TG L T+R+I+L G QV PRL+RT LGA +SQ+TA EL +N 
Sbjct: 241 WHLNYQDVIAIGKLFTTGELFTDRIISLAGPQVKNPRLVRTRLGANLSQLTANELNAGEN 3 00 

Query: 298 RVISGSVLNGAITQGAHDYLGRYHNQISVIEEGRSKELFGWVAPQPDKYSITRTTLGHFL 357

RVISGSVL+GA G DYLGRY Q+SV+ EGR KELFGW+ P DK+SITRT LGHF 
Sbjct: 301 RVISGSVLSGATAAGPVDYLGRYALQVSVLAEGREKELFGWIMPGSDKFSITRTVLGHFG 360 

Query: 358 KNKLFKFTTAVNGGDRAMVPIGTYERVMXXXXXXXXXXXXXXVGDTDSAQXXXXXXXXXX 417 
K KLF FTTAV+GG+RAMVPIG YERVM GDTDSAQ 
Sbjct: 361 K-KLFNFTTAVHGGERAMVPIGAYERVMPLDIIPTLLLRDLAAGDTDSAQNLGCLELDEE 419



Query: 418 XXXXXS FVCPGKYEXGPLLRKVLETXEKEG 447 
++VCPGK GP+LR LE EKEG 

ORF22ng-1 (SEQ ID NO: 132) also shows homology with the OMP (SEQ ID NO: 1123) from
A. pleuropneumoniae:

gi|1185395 (U24492) 48 kDa outer membrane protein [Actinobacillus
pleuropneumoniae] Length = 449
Score = 555 bits (1414), Expect = e-157 

Identities = 284/450 (63%), Positives = 337/450 (74%), Gaps = 4/450 (0%) 



Query : 27 MIKIKKGLNLPIAGRPEQVIYDGPAITEVALLGEEYVGMRPSMKIKEGEAVKKGQVLFED 86 
MI IKKGL+LPIAG P QVI++G + EVA+LGEEYVGMRPSMK++EG+ VKKGQVLFED

Sbjct: 1 MITIKKGLDLPIAGTPAQVIHNGNTVNEVAMLGEEYVGMRPSMKVREGDVVKKGQVLFED 60



Query: 


87 


Sbjct: 


61 


Query: 


147 


Sbjct: 


121 


Query: 


207 


Sbjct: 


181 


Query: 


264 


Sbjct: 


241 


Query: 


324 


Sbjct: 


301 


Query: 


384 


Sbjct: 


361 


Query: 


444 


Sbjct : 


420 



KKNPGWFTAPASG + I+RGEKRVLQSWI VEG++ + I F RY LA LS+E+V+ + 



NLI+SGLWTA RTRPFSK+PA+DA P + I FVNAMDTNPLAADP V++KE DFK GL V 



L+RL ++ +++CK A +++P S I F G HPAGL GTHIHF++PVGA K V 



W +NYQDVIAIG+LF TG L T+R+++L G QV PRL+RT LGA +SQLTA EL +N 



RVISGSVL+GA A G DYLGRY Q+SV+ EGR KELFGW+ P DK+SITRT LGHF 



K KLF FTTAV+GG+RAMVPIG YERVM GDTDSAQ 



++VCPGK YGP+LR LE IEKEG

Sbjct: 420 DLALCTYVCPGKNNYGPMLRAALEKIEKEG 449 

Based on this analysis, including the homology with the outer membrane protein (SEQ ID NO:
1123) of Actinobacillus pleuropneumoniae, it was predicted that these proteins from N. meningitidis
and N. gonorrhoeae, and their epitopes, could be useful antigens for vaccines or diagnostics, or for
raising antibodies.

ORF22-1 (SEQ ID NO: 126) (35.4kDa) was cloned in pET and pGex vectors and expressed in
E. coli, as described above. The products of protein expression and purification were analyzed by
SDS-PAGE. Figure 5A shows the results of affinity purification of the GST-fusion protein, and
Figure 5B shows the results of expression of the His-fusion in E. coli. Purified GST-fusion protein
was used to immunise mice, whose sera were used for ELISA (positive result) and FACS analysis
(Figure 5C). These experiments confirm that ORF22-1 (SEQ ID NO: 126) is a surface-exposed
protein, and that it is a useful immunogen.

Example 16 



The following partial DNA sequence was identified in N. meningitidis [<SEQ ID 133>] (SEQ ID
NO: 133):
1 ..GCGnCGnAAA TCATCCATCC CC..nACGTC GTAGGCCCTG AAGCCAACTG

51 GTTTTTTATG GTAGCCAGTA CGTTTGTGAT TGCTTTGATT GGTTATTTTG 

101 TTACTGAAAA AATCGTCGAA CCGCAATTGG GCCCTTATCA ATCAGATTTG 

151 TCACAAGAAG AAAAAGACAT TCGGCATTCC AATGAAATCA CGCCTTTGGA 

201 ATATAAAGGA TTAATTTGGG CTGGCGTGGT GTTTGTTGCC TTATCCGCCC 

251 TATTGGCTTG GAGCATCGTC CCTGCCGACG GTATTTTGCG TCATCCTGAA 

301 ACAGGATTGG TTTCCGGTTC GCCGTTTTTA AAATCGATTG TTGTTTTTAT 

351 TTTCTTGTTG TTTGCACTGC CGGGCATTGT TTATGGCCGG GTAACCCGAA 

401 GTTTGCGCGG CGAACAGGAA GTCGTTAATG CGmyGGCCGA ATCGATGAGT

451 ACTCTGGsGC TTTmTTTGsw CAkcATCTTT TTTGCCGCAC AGTTTGTCGC

501 ATTTTTTAAT TGGACGAATA TTGGGCAATA TATTGCCGTT AAAGGGGCGA 

551 CGTTCTTAAA AGAAGTCGGC TTGGGCGGCA GCGTGTTGTT TATCGGTTTT 

601 ATTTTAATTT GTGCTTTTAT CAATCTGATG ATAGGCTCCG CCTCCGCGCA 

651 ATGGGCGGTA ACTGCGCCGA TTTTCGTCCC TATGCTGATG TTGGCCGGCT 

701 ACGCGCCCGA AGTCATTCAA GCCGCTTACC GCATCGGTGA TTCCGTTACC 

751 AATATTATTA CGCCGATGAT GAGTTATTTC GGGCTGATTA TGGCGACGGT 

801 GrkCmmmTAC AAAAAAGATG CGGGCGTGGG TaCGcTGATT wCTATGATGT 

851 TGCCGTATTC CGCTTTCTTC TTGATTGCgT GGATTGCCTT ATTCTGCATT 

901 TGGGTATTTg TTTTGGGCCT GCCCGTCGGT CCCGGCGCGC CCACATTCTA 

951 TCCCGCACCT TAA 

This corresponds to the amino acid sequence [<SEQ ID 134; ORF12>] (SEQ ID NO: 134;
ORF12):



1 ..AXXIIHPXXV VGPEANWFFM VASTFVIALI GYFVTEKIVE PQLGPYQSDL
51 SQEEKDIRHS NEITPLEYKG LIWAGVVFVA LSALLAWSIV PADGILRHPE
101 TGLVSGSPFL KSIVVFIFLL FALPGIVYGR VTRSLRGEQE VVNAXAESMS
151 TLXLXLXXIF FAAQFVAFFN WTNIGQYIAV KGATFLKEVG LGGSVLFIGF
201 ILICAFINLM IGSASAQWAV TAPIFVPMLM LAGYAPEVIQ AAYRIGDSVT
251 NIITPMMSYF GLIMATVXXY KKDAGVGTLI XMMLPYSAFF LIAWIALFCI
301 WVFVLGLPVG PGAPTFYPAP *

Further sequence analysis revealed the complete DNA sequence [<SEQ ID 135>] (SEQ ID NO:
135) to be:



1 ATGAGTCAAA CCGATACGCA ACGGGACGGA CGATTTTTAC GCACAGTCGA 

51 ATGGCTGGGC AATATGTTGC CGCATCCGGT TACGCTTTTT ATTATTTTCA 

101 TTGTGTTATT GCTGATTGCC TCTGCCGTCG GTGCGTATTT CGGACTATCC 

151 GTCCCCGATC CGCGCCCTGT TGGTGCGAAA GGACGTGCCG ATGACGGTTT 

201 GATTTACATT GTCAGCCTGC TCAATGCCGA CGGTTTTATC AAAATCCTGA 

251 CGCATACCGT TAAAAATTTC ACCGGTTTCG CGGCGTTGGG AACGGTGTTG 

301 GTTTCTTTAT TGGGCGTGGG GATTGCGGAA AAATCGGGCT TGATTTCCGC 

351 ATTAATGCGC TTATTGCTCA CAAAATCGCC ACGCAAACTC ACTACTTTTA

401 TGGTTGTTTT TACAGGGATT TTATCTAATA CCGCTTCTGA ATTGGGCTAT
451 GTCGTCCTAA TCCCTTTGTC CGCCATCATC TTTCATTCCC TCGGCCGCCA 
501 TCCGCTTGCC GGTCTGGCTG CGGCTTTCGC CGGCGTTTCG GGCGGTTATT 
551 CGGCCAATCT GTTCTTAGGC ACAATCGATC CGCTCTTGGC AGGCATCACC 
601 CAACAGGCGG CGCAAATCAT CCATCCCGAC TACGTCGTAG GCCCTGAAGC 
651 CAACTGGTTT TTTATGGTAG CCAGTACGTT TGTGATTGCT TTGATTGGTT 
701 ATTTTGTTAC TGAAAAAATC GTCGAACCGC AATTGGGCCC TTATCAATCA 
751 GATTTGTCAC AAGAAGAAAA AGACATTCGG CATTCCAATG AAATCACGCC 
801 TTTGGAATAT AAAGGATTAA TTTGGGCTGG CGTGGTGTTT GTTGCCTTAT 
851 CCGCCCTATT GGCTTGGAGC ATCGTCCCTG CCGACGGTAT TTTGCGTCAT 
901 CCTGAAACAG GATTGGTTTC CGGTTCGCCG TTTTTAAAAT CGATTGTTGT 
951 TTTTATTTTC TTGTTGTTTG CACTGCCGGG CATTGTTTAT GGCCGGGTAA 

1001 CCCGAAGTTT GCGCGGCGAA CAGGAAGTCG TTAATGCGAT GGCCGAATCG 

1051 ATGAGTACTC TGGGGCTTTA TTTGGTCATC ATCTTTTTTG CCGCACAGTT 
1101 TGTCGCATTT TTTAATTGGA CGAATATTGG GCAATATATT GCCGTTAAAG

1151 GGGCGACGTT CTTAAAAGAA GTCGGCTTGG GCGGCAGCGT GTTGTTTATC 

1201 GGTTTTATTT TAATTTGTGC TTTTATCAAT CTGATGATAG GCTCCGCCTC 

1251 CGCGCAATGG GCGGTAACTG CGCCGATTTT CGTCCCTATG CTGATGTTGG 

1301 CCGGCTACGC GCCCGAAGTC ATTCAAGCCG CTTACCGCAT CGGTGATTCC
1351 GTTACCAATA TTATTACGCC GATGATGAGT TATTTCGGGC TGATTATGGC 
1401 GACGGTGATC AAATACAAAA AAGATGCGGG CGTGGGTACG CTGATTTCTA

1451 TGATGTTGCC GTATTCCGCT TTCTTCTTGA TTGCGTGGAT TGCCTTATTC
1501 TGCATTTGGG TATTTGTTTT GGGCCTGCCC GTCGGTCCCG GCGCGCCCAC 
1551 ATTCTATCCC GCACCTTAA 



This corresponds to the amino acid sequence [<SEQ ID 136; ORF12-1>] (SEQ ID NO: 136;
ORF12-1):

1 MSQTDTQRDG RFLRTVEWLG NMLPHPVTLF IIFIVLLLIA SAVGAYFGLS
51 VPDPRPVGAK GRADDGLIYI VSLLNADGFI KILTHTVKNF TGFAPLGTVL
101 VSLLGVGIAE KSGLISALMR LLLTKSPRKL TTFMVVFTGI LSNTASELGY
151 VVLIPLSAII FHSLGRHPLA GLAAAFAGVS GGYSANLFLG TIDPLLAGIT
201 QQAAQIIHPD YVVGPEANWF FMVASTFVIA LIGYFVTEKI VEPQLGPYQS
251 DLSQEEKDIR HSNEITPLEY KGLIWAGVVF VALSALLAWS IVPADGILRH
301 PETGLVSGSP FLKSIVVFIF LLFALPGIVY GRVTRSLRGE QEVVNAMAES
351 MSTLGLYLVI IFFAAQFVAF FNWTNIGQYI AVKGATFLKE VGLGGSVLFI
401 GFILICAFIN LMIGSASAQW AVTAPIFVPM LMLAGYAPEV IQAAYRIGDS
451 VTNIITPMMS YFGLIMATVI KYKKDAGVGT LISMMLPYSA FFLIAWIALF
501 CIWVFVLGLP VGPGAPTFYP AP*

Computer analysis of this amino acid sequence gave the following results: 
Homology with a predicted ORF from N. meningitidis (strain A) 

ORF12 (SEQ ID NO: 134) shows 96.3% identity over a 320aa overlap with an ORF (ORF12a)
(SEQ ID NO: 138) from strain A of N. meningitidis:



10 20 30 

orfl2 pep AXXIIHPXXWGPEANWFFMVASTFVIALI 

I I I I I MINIMI I I I I 
orf 12a AAAFAGVSGGYSANLFLGTIDPLLAGITQQAAQI IHPDYWGPEANWFFMVASTFVIALI 

180 190 200 210 220 230 



40 50 60 70 80 90 

orf 12 . pep GYFVTEKIVEPQLGPYQSDLSQEEKDIRHSNEITPLEYKGLIWAGWFVALSALLAWSIV 

MMIMMMMMIMIMMMMIMMMMM MNMI I I Ml I 

orf 12a GYFVTEKIVEPQLGPYQSDLSQEEKDIRHSNEITPLEYKGLIWAGWFVALSALLAWSIV 
240 250 260 270 280 290 



100 110 120 130 140 150 

orf 12 . pep PADGILRHPETGLVSGSPFLKSIWFIFLLFALPGIVYGRVTRSLRGEQEWNAXAESMS 

MIMIMMMMM MMMMMMIMMIMM I I Ml Mill Mill 

orf 12a PADGILRHPETGLVSGSPFLKSIWFIFLLFALPGIVYGRVTRSLRGEQEWNAMAESMS 
300 310 320 330 340 350 
160 170 180 190 200 210

orf12.pep TLXLXLXXIFFAAQFVAFFNWTNIGQYIAVKGATFLKEVGLGGSVLFIGFILICAFINLM

II I I I II 1 1 1 M 1 1 1 1 1 1 II MM II 1 1 1 1 1 1 1 1 1 1 1 1 1 Ml 1 1 M 1 1 1 1 1 1 1 

orf12a    TLGLYLVIIFFAAQFVAFFNWTNIGQYIAVKGATFLKEVGLGGSVLFIGFILICAFINLM

360 370 380 390 400 410 

220 230 240 250 260 270 

orf12.pep IGSASAQWAVTAPIFVPMLMLAGYAPEVIQAAYRIGDSVTNIITPMMSYFGLIMATVXXY

1 1 1 1 1 1 1 II 1 1! M 1 1 1 1 II I II 1 1 1 1 1 1 1 1 1 1 1 1 1 1 Mil 1 1 1 M 1 1 II 1 1 1 1 1 I 

orf 12a IGSASAQWAVTAPIFVPMLMLAGYAPEVIQAAYRIGDSVTNIITPMMSYFGLIMATVIKY 
420 430 440 450 460 470 

280 290 300 310 320 

orf 12 . pep KKDAGVGTLIXMMLPYSAFFLIAWIALFCIWVFVLGLPVGPGAPTFYPAPX 

Illlllllll 1 1 M 1 1 1 1 1 1 1 1 1 1 M MM 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 

orf 12a KKDAGVGTLISMMLPYSAFFLIAWIALFCIWVFVLGLPVGPGAPTFYPAPX 
480 490 500 510 520 

The complete length ORF12a nucleotide sequence [<SEQ ID 137>] (SEQ ID NO: 137) is:

1 ATGAGTCAAA CCGATACGCA ACGGGACGGA CGATTTTTAC GCACAGTCGA 

51 ATGGCTGGGC AATATGTTGC CGCACCCGGT TACGCTTTTT ATTATTTTCA 

101 TTGTGTTATT GCTGATTGCC TCTGCCGCCG GTGCGTATTT CGGACTATCC 

151 GTCCCCGATC CGCGCCCTGT TGGTGCGAAA GGACGTGCCG ATGACGGTTT 

201 GATTCACGTT GTCAGCCTGC TCGATGCTGA CGGTTTGATC AAAATCCTGA 

251 CGCATACCGT TAAAAATTTC ACCGGTTTCG CGCCGTTGGG AACGGTGTTG 

301 GTTTCTTTAT TGGGCGTGGG GATTGCGGAA AAATCGGGCT TGATTTCCGC 

351 ATTAATGCGC TTATTGCTCA CAAAATCTCC ACGCAAACTC ACTACTTTTA 

401 TGGTTGTTTT TACAGGGATT TTATCTAATA CCGCTTCTGA ATTGGGCTAT

451 GTCGTCCTAA TCCCTTTGTC CGCCATCATC TTTCATTCCC TCGGCCGCCA

501 TCCGCTTGCC GGTCTGGCTG CGGCTTTCGC CGGCGTTTCG GGCGGTTATT 

551 CGGCCAATCT GTTCTTAGGC ACAATCGATC CGCTCTTGGC AGGCATCACC 

601 CAACAGGCGG CGCAAATCAT CCATCCCGAC TACGTCGTAG GCCCTGAAGC 

651 CAACTGGTTT TTTATGGTAG CCAGTACGTT TGTGATTGCT TTGATTGGTT 

701 ATTTTGTTAC TGAAAAAATC GTCGAACCGC AATTGGGCCC TTATCAATCA 

751 GATTTGTCAC AAGAAGAAAA AGACATTCGA CATTCCAATG AAATCACGCC 

801 TTTGGAATAT AAAGGATTAA TTTGGGCTGG CGTGGTGTTT GTTGCCTTAT 

851 CCGCCCTATT GGCTTGGAGC ATCGTCCCTG CCGACGGTAT TTTGCGTCAT 

901 CCTGAAACAG GATTGGTTTC CGGTTCGCCG TTTTTAAAAT CAATTGTTGT 

951 TTTTATTTTC TTGTTGTTTG CACTGCCGGG CATTGTTTAT GGCCGGGTAA 

1001 CCCGAAGTTT GCGCGGCGAA CAGGAAGTCG TTAATGCGAT GGCCGAATCG 

1051 ATGAGTACTC TGGGGCTTTA TTTGGTCATC ATCTTTTTTG CCGCACAGTT 

1101 TGTCGCATTT TTTAATTGGA CGAATATTGG GCAATATATT GCCGTTAAAG 

1151 GGGCGACGTT CTTAAAAGAA GTCGGCTTGG GCGGCAGCGT GTTGTTTATC 

1201 GGTTTTATTT TAATTTGTGC TTTTATCAAT CTGATGATAG GCTCCGCCTC

1251 CGCGCAATGG GCGGTAACTG CGCCGATTTT CGTCCCTATG CTGATGTTGG

1301 CCGGCTACGC GCCCGAAGTC ATTCAAGCCG CTTACCGCAT CGGTGATTCC 

1351 GTTACCAATA TTATTACGCC GATGATGAGT TATTTCGGGC TGATTATGGC 

1401 GACGGTGATC AAATACAAAA AAGATGCGGG CGTGGGTACG CTGATTTCTA 

1451 TGATGTTGCC GTATTCCGCT TTCTTCTTGA TTGCGTGGAT TGCCTTATTC 

1501 TGCATTTGGG TATTTGTTTT GGGCCTGCCC GTCGGTCCCG GCGCGCCCAC 

1551 ATTCTATCCC GCACCTTAA 

This encodes a protein having amino acid sequence [<SEQ ID 138>] (SEQ ID NO: 138):



1 MSQTDTQRDG RFLRTVEWLG NMLPHPVTLF IIFIVLLLIA SAAGAYFGLS
51 VPDPRPVGAK GRADDGLIHV VSLLDADGLI KILTHTVKNF TGFAPLGTVL
101 VSLLGVGIAE KSGLISALMR LLLTKSPRKL TTFMVVFTGI LSNTASELGY
151 VVLIPLSAII FHSLGRHPLA GLAAAFAGVS GGYSANLFLG TIDPLLAGIT
201 QQAAQIIHPD YVVGPEANWF FMVASTFVIA LIGYFVTEKI VEPQLGPYQS
251 DLSQEEKDIR HSNEITPLEY KGLIWAGVVF VALSALLAWS IVPADGILRH
301 PETGLVSGSP FLKSIVVFIF LLFALPGIVY GRVTRSLRGE QEVVNAMAES
351 MSTLGLYLVI IFFAAQFVAF FNWTNIGQYI AVKGATFLKE VGLGGSVLFI
401 GFILICAFIN LMIGSASAQW AVTAPIFVPM LMLAGYAPEV IQAAYRIGDS
451 VTNIITPMMS YFGLIMATVI KYKKDAGVGT LISMMLPYSA FFLIAWIALF
501 CIWVFVLGLP VGPGAPTFYP AP*



ORF12a (SEQ ID NO: 138) and ORF12-1 (SEQ ID NO: 136) show 99.0% identity in 522 aa 
overlap: 






10 20 30 40 50 60 

orf 12a. pep MSQTDTQRDGRFLRTVEWLGNMLPHPVTLFIIFIVLLLIASAAGAYFGLSVPDPRPVGAK 

I MM 1 1 1 1 1 1 1 1 1 1 1 II 1 1 II! II 1 1 1 1 1 M I M M I M I M 1 1 1 1 1 M 1 1 1 1 1 1 1 1 

orf 12-1 MSQTDTQRDGRFLRTVEWLGNMLPHPVTLF I I F I VLLLI AS AVGAYFGLSVPDPRPVGAK 

10 20 30 40 50 60 






70 80 90 100 110 120 

orf 12a . pep GRADDGLIHWSLLDADGLIKILTHTVKNFTGFAPLGTVLVSLLGVGIAEKSGLISALMR 

I I I I I I I I : = I II h I I h I I I I II I I I II I I I I I I I II I I I II I I I I I I I I I I M I I I I 
orf 12-1 GRADDGL I Y I VSLLNADGF I KI LTHTVKNFTGFAPLGTVLVSLLGVG I AEKSGL I S ALMR 

70 80 90 100 110 120 






130 140 150 160 170 180 

orf 12a. pep LLLTKSPRKLTTFMVVFTGILSNTASELGYVVLIPLSAIIFHSLGRHPLAGLAAAFAGVS 

I 1 1 . 1 1 II 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 M II 1 1 1 1 1 1 1 1 M 1 1 II I > 1 1 1 1 1 1 I Ml 1 1 1 

or f 1 2 - 1 LLLTKS PRKLTTFMWFTG I LSNTASELGYWL I PLSAI I FHS LGRHPLAGLAAAFAGVS 

130 140 150 160 170 180 






190 200 210 220 230 240 

orf 12a. pep GGYSANLFLGTIDPLLAGITQQAAQIIHPDYWGPEANWFFMVASTFVIALIGYFVTEKI 

I I I I I I I I I I I I I I I I I I ' I I I 

orf 12-1 GGYSANLFLGTIDPLLAGITQQAAQIIHPDYVVGPEANWFFMVASTFVIALIGYFVTEKI 

190 200 210 220 230 240 






250 260 270 280 290 300 

orf 12a. pep VE PQLGPYQSDLSQEEKD I RHSNE I TPLEYKGL I WAGWFVALSALLAWS IVPADGILRH 

I I I I I I I I I I I II I I I | I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I M M I M 
orf 12-1 VE PQLGPYQSDLSQEEKD I RHSNE I TPLEYKGL I WAGWFVALSALLAWS IVPADGILRH 

250 260 270 280 290 300 






310 320 330 340 350 360 

orf 12a. pep PETGLVSGSPFLKSIWFIFLLFALPGIVYGRVTRSLRGEQEWNAMAESMSTLGLYLVI 

I I I I I I i I I I I I I I ' I I I I I I I I I I I I I I I 

orf 12 - 1 PETGLVSGSPFLKSI WFIFLLFALPGIVYGRVTRSLRGEQEWNAMAESMSTLGLYLVI 

310 . 320 330 340 350 360 






370 380 390 400 410 420 

orf 12a. pep IFFAAQFVAFFNWTNIGQYIAVKGATFLKEVGLGGSVLFIGFILICAFINLMIGSASAQW 

MIMIMI MIMI MIMI MMMMMMMMMMMMMIIIMMM 

orf 12-1 I FFAAQFVAFFNWTNI GQY I AVKGATFLKEVGLGGS VLF I GF I L I CAF INLM I GS AS AQW 

370 380 390 400 410 420 



430 440 450 460 470 480 

orf 12a. pep AVTAP I FVPMLMLAGYAPEVIQAAYRIGDS VTNIITPMMS YFGLIMATVI KYKKDAGVGT 

1 1 MINI 1 1 1 1 1 1 1 1 1 1 ! 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 II 1 1 1 1 1 1 1 ! ! I 
orf 12-1 AVTAPIFVPMLMLAGYAPEVIQAAYRIGDSVTNIITPMMSYFGLIMATVIKYKKDAGVGT 

430 440 450 460 470 480 

490 500 510 520 

orf 12a . pep LISMMLPYSAFFLIAWIALFCIWVFVLGLPVGPGAPTFYPAPX 

5 1 1 1 M 1 1 1 1 1 1 1 1 1 1 1 1 I M 1 1 1 1 II 1 1 1 1 1 1 1 1! 1 1 1 1 1 

orf 12-1 LISMMLPYSAFFLIAWIALFCIWVFVLGLPVGPGAPTFYPAPX 

490 500 510 520 
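
The identity figure quoted for an alignment such as the one above can be illustrated with a minimal sketch. It assumes two equal-length aligned strings, skips columns containing a gap character, and counts matching residues; whether the original alignment program treated gaps this way is an assumption. The function name `percent_identity` is hypothetical, and the example uses only the first 60 residues shown in the alignment above, so its result is not the whole-alignment figure.

```python
def percent_identity(a, b):
    """Return (identical, compared, percent) for two aligned strings.

    Assumption: columns where either sequence has a gap ('-') are skipped.
    """
    if len(a) != len(b):
        raise ValueError("aligned sequences must have the same length")
    compared = 0
    identical = 0
    for x, y in zip(a, b):
        if x == "-" or y == "-":
            continue
        compared += 1
        if x == y:
            identical += 1
    if compared == 0:
        raise ValueError("no comparable columns")
    return identical, compared, 100.0 * identical / compared


# Residues 1-60 of ORF12a and ORF12-1, written in blocks of ten for readability.
orf12a = "MSQTDTQRDG RFLRTVEWLG NMLPHPVTLF IIFIVLLLIA SAAGAYFGLS VPDPRPVGAK".replace(" ", "")
orf12_1 = "MSQTDTQRDG RFLRTVEWLG NMLPHPVTLF IIFIVLLLIA SAVGAYFGLS VPDPRPVGAK".replace(" ", "")

identical, compared, pct = percent_identity(orf12a, orf12_1)
print(f"{identical}/{compared} identical ({pct:.1f}%)")
```

Applied to the full 522-residue alignment, the same count would be expected to give a value close to the reported identity, subject to how the original program handled the terminal stop symbol.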

Homology with a predicted ORF from N. gonorrhoeae 

ORF12 (SEQ ID NO: 134) shows 92.5% identity over a 320aa overlap with a predicted ORF 
(ORF12.ng) (SEQ ID NO: 140) from N. gonorrhoeae: 

orf 12 .pep AXXIIHPXXWGPEANWFFMVASTFVIALI 30 

I I I I I Mill IMIMIIIIIMI 
orf 12ng AAAFAGVSGGYSANLFLGTIDPLLAGITQQAAQIIHPDYWGPEANWFFMAASTFVIALI 232 

orf 12 .pep GYFVTEKIVEPQLGPYQSDLSQEEKDIRHSNEITPLEYKGLIWAGWFVALSALLAWSIV 90 

15 | MM || || Ml Illlllllll Illlllllll II 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 M I Ml M I 

orf 12ng GYFVTEKIVEPQLGPYQSDLSQEEKDIRHSNEITPLEYKGLIWAGWFVALSALLAWSIV 2 92 

orf 12 .pep PADGILRHPETGLVSGSPFLKSIWFIFLLFALPGIVYGRVTRSLRGEQEWNAXAESMS 150 

1 1 1 1 1 1 M 1 1 1 1 Ml I M 1 1 1 1 1 1 1 M II I II II 1 1 1 M 1 1 M MM 1 1 1 1 Mill 

orf 12ng PADGILRHPETGLVAGSPFLKSIWFIFLLFALPGIVYGRITRSLRGEREWNAMAESMS 352 

orf 12 .pep TLXLXLXXIFFAAQFVAFFNWTNIGQYIAVKGATFLKEVGLGGSVLFIGFILICAFINLM 210 

II I I I II 1 1 M I M I II I M 1 1 1 1 1 1 M M I M I IIIIIIIIIIIMIII 

orf 12ng TLGLYLVIIFFAAQFVAFFNWTNIGQYIAVKGAVFLKKFRLGGSVLFIGFILICAFINLM 412 

orf 12 .pep IGSASAQWAVTAPIFVPMLMLAGYAPEVIQAAYRIGDSVTNIITPMMSYFGLIMATVXXY 270 

1 1 1 1 1 1 1 II 1 1 1 1 M II 1 1 1 1 M I h 1 1 1 1 1 1 1 II 1 1 1 1 1 M 1 1 1 1 1 1 1 1 II I M I I 

orf 12ng IGSASAQWAVTAPIFVPMLMLAGNAPQVIQAAYRIGDSVTNIITPMMSYFGLIMATVIKY 472 

orf 12 .pep KKDAGVGTLIXMMLPYSAFFLIAWIALFCIWVFVLGLPVGPGAPTFYPAP 320 

Illlllllll 1 1 1 1 1 1 I I 1 I I I 1 I 1 1 I I I I I I 1 I E I 1 I I I 1 : 1 1 1 ! I = ! 
orf 12ng KKDAGVGTLISMMLPYSAFFLIAWIALFCIWVFVLGLPVGPGTPTFYPVP 522 

The complete length ORF12ng nucleotide sequence [<SEQ ID 139>] (SEQ ID NO: 139) is: 

1 ATGAGTCAAA CCGACGCGCG TCGTAGCGGA CGATTTTTAC GCACAGTCGA 

51 ATGGCTGGGC AATATGTTGC CGCACCCGGT TACGCTTTTT ATTATTTTCA 

101 TTGTGTTATT GCTGATTGcc tctgCCGTCG GTGCGTATTT CGGACTATCC 

151 GTCCCCGATC CGCGTCCTGT TGGGGCGAAA GGACGTGCCG ATGACGGTTT 

201 GATTCACGTT GTCAGCCTGC TCGATGCCGA CGGTTTGATC AAAATCCTGA 

251 CGCATACCGT TAAAAATTTC ACCGGTTTCG CGCCGTTGGG AACGGTGTTG 

301 GTTTCTTTAT TGGGCGTGGG GATTGCGGAA AAATCGGGCT TGATTTCCGC 

351 ATTAATGCGC TTATTGCTCA CAAAATCCCC ACGCAAACTC ACTACTTTTA 

401 TGGTTGTTTT TACAGGGATT TTATCCAATA CGGCTTCTGA ATTGGGCTAT 

451 GTCGTCCTAA TCCCTTTGTC CGCCGTCATC TTTCATTCGC TCGGCCGCCA 

501 TCCGCTTGCC GGTTTGGCTG CGGCTTTCGC CGGCGTTTCG GGCGGTTATT 

551 CGGCCAATCT GTTCTTAGGC ACAATCGATC CGCTCTTGGC AGGCATCACC 

601 CAACAGGCGG CGCAAATCAT CCATCCCGAC TACGTCGTAG GCCCTGAAGC 

651 CAACTGGTTT TTTATGGCAG CCAGTACGTT TGTGATTGCT TTGATTGGTT 

701 ATTTTGTTAC TGAAAAAATC GTCGAACCGC AATTGGGCCC TTATCAATCA 

751 GATTTGTCAC AAGAAGAAAA AGACATTCGG CATTCCAATG AAATCACGCC 

801 TTTGGAATAT AAAGGATTAA TTTGGGCAGG CGTGGTGTTT GTTGCCTTAT 

851 CCGCCCTATT GGCTTGGAGC ATCGTCCCTG CCGACGGTAT TTTGCGTCAT 

901 CCTGAAACAG GATTGGTTGC CGGTTCGCCG TTTTTAAAAT CGATTGTTGT 

951 TTTTATTTTC TTGTTGTTTG CGCTGCCGGG CATTGTTTAT GGCCGGATAA 

1001 CCCGAAGTTT GCGCGGCGAA CGGGAAGTCG TTAATGCGAT GGCCGAATCG 

1051 ATGAGTACTT TGGGACTTTA TTTGGTCATC ATCTTTTTTG CCGCACAGTT 

1101 TGTCGCATTT TTTAATTGGA CGAATATTGG GCAATATATT GCCGTTAAAG 

1151 GGGCGGTGTT CTTAAAAGAA GTCGGCTTGG GCGGCAGTGT GTTGTTTATC 

1201 GGTTTTATTT TAATTTGTGC TTTTATCAAT CTGATGATAG GCTCCGCCTC 

1251 CGCGCAATGG GCGGTAACTG CGCCGATTTT CGTCCCTATG CTGATGTTGG 

1301 CCGGCTACGC GCCCGAAGTC ATTCAAGCCG CTTACCGCAT CGGTGATTCC 

1351 GTTACCAATA TTATTACGCC GATGATGAGT TATTTCGGGC TGATTATGGC 

1401 GACGGTAATC AAATACAAAA AAGATGCGGG CGTAGGCACG CTGATTTCTA 

1451 TGATGTTGCC GTATTCCGCT TTCTTCTTAA TTGCATGGAT CGCCTTATTC 

1501 TGCATTTGGG TATTTGTTTT GGGTCTGCCC GTCGGTCCCG GCACACCCAC 

1551 ATTCTATCCG GTGCCTTAA 

This encodes a protein having amino acid sequence [<SEQ ID 140>] (SEQ ID NO: 140): 

1 MSQTDARRSG RFLRTVEWLG NMLPHPVTLF IIFIVLLLIA SAVGAYFGLS 
51 VPDPRPVGAK GRADDGLIHV VSLLDADGLI KILTHTVKNF TGFAPLGTVL 
101 VSLLGVGIAE KSGLISALMR LLLTKSPRKL TTFMVVFTGI LSNTASELGY 
151 VVLIPLSAVI FHSLGRHPLA GLAAAFAGVS GGYSANLFLG TIDPLLAGIT 
201 QQAAQIIHPD YVVGPEANWF FMAASTFVIA LIGYFVTEKI VEPQLGPYQS 
251 DLSQEEKDIR HSNEITPLEY KGLIWAGVVF VALSALLAWS IVPADGILRH 
301 PETGLVAGSP FLKSIVVFIF LLFALPGIVY GRITRSLRGE REVVNAMAES 
351 MSTLGLYLVI IFFAAQFVAF FNWTNIGQYI AVKGAVFLKK FRLGGSVLFI 
401 GFILICAFIN LMIGSASAQW AVTAPIFVPM LMLAGNAPQV IQAAYRIGDS 
451 VTNIITPMMS YFGLIMATVI KYKKDAGVGT LISMMLPYSA FFLIAWIALF 
501 CIWVFVLGLP VGPGTPTFYP VP* 

ORF12ng (SEQ ID NO: 140) shows 97.1% identity in 522 aa overlap with ORF12-1 (SEQ ID NO: 
136): 

10 20 30 40 50 60 

orf 12 - 1 . pep MSQTDTQRDGRFLRTVEWLGNMLPHPVTLFI IFIVLLLIASAVGAYFGLSVPDPRPVGAK 
I II - h M I 'I I I I M I I ' II I I I II M I I I I! I M I i I I I I I I I I I i I I I I ! I I 
orf 12ng MSQTDARRSGRFLRTVEWLGNMLPHPVTLFI IFIVLLLIASAVGAYFGLSVPDPRPVGAK 

10 20 30 40 50 60 

70 80 90 100 110 120 

orf 12 - 1 . pep GRADDGL I Y I VS LLNADGF I KI LTHTVKNFTGFAPLGTVLVSLLGVG I AEKSGL I S ALMR 
I I I I M I I- I I IM I I: I I I I I I I I I I I I I I I I II I I M II I I I I I I I I I I I I I I I I I 
orf 12ng GRADDGL I HWSLLDADGL I KI LTHTVKNFTGFAPLGTVLVSLLGVG I AEKSGL IS ALMR 

70 80 90 100 110 120 

130 140 150 160 170 180 

orf 12 - 1 . pep LLLTKSPRKLTTFMWFTGILSNTASELGYWLIPLSAI IFHSLGRHPLAGLAAAFAGVS 

III IMIIIIMIIIIIII llllllll IlllllhUIIIIIIIIIIIIIIIIII 

orf 12ng LLLTKSPRKLTTFMWFTG I LSNTASELGYWL I PLSAV IFHSLGRHPLAGLAAAFAGVS 

130 140 150 160 170 180 

190 200 210 220 230 240 

orf 12 - 1 . pep GGYS ANLFLGT IDPLLAG I TQQAAQ I IHPDYWGPEANWFFMVASTFVIALIGYFVTEKI 

I I I ; I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I h I I I I I I I I I I I I Ml I 
orf 12ng GGYS ANLFLGT I DPLLAG I TQQAAQ I IHPDYWGPEANWFFMAASTFVIALIGYFVTEKI 

190 200 210 220 230 240 



250 260 270 280 290 300 

orf 12-1 .pep VEPQLGPYQSDLSQEEKDIRHSNE ITPLEYKGL IWAGWFVALSALLAWS I VPADGI LRH 

1 1 1 1 1 1 1 1 1 M 1 1 1 II 1 1 1 1 1 1 M I II 1 1 Ml! 1 1 1 1 Ml 1 1 II 1 1 1 1 1 1 M 1 1 1 1 1 1 M 

orf 12ng VEPQLGPYQSDLSQEEKDIRHSNE ITPLEYKGL IWAGWFVALSALLAWS I VPADGI LRH 

250 260 270 280 290 300 






310 320 330 340 350 360 

orf 12-1 .pep PETGLVSGSPFLKSIWFIFLLFALPGIVYGRVTRSLRGEQEWNAMAESMSTLGLYLVI 

I II 1 1 M II 1 1 1 1 M M 1 1 1 1 1 1 II II II 1 1 M M MM I M M 1 1 1 1 1 1 M 1 1 1 1 1 II I 

orf 12ng PETGLVAGSPFLKSIWFIFLLFALPGIVYGRITRSLRGEREWNAMAESMSTLGLYLVI 

310 320 330 340 350 360 






370 380 390 400 410 420 

orf 12- 1 . pep I FFAAQFVAFFNWTN I GQY I AVKGATFLKEVGLGGS VLF I GF I L I CAF INLM I GS AS AQW 

I II I I I I I I II I I I I I I II I I I I I I : I I I I I I I i I I I I I I I I I I I I I I I I I I I M I I I I I 
orf 12ng I FFAAQFVAFFNWTN I GQY I AVKGAVFLKEVGLGGS VLF IGF I L I CAF INLM I GS AS AQW 

370 380 390 400 410 420 






430 440 450 460 470 480 

orf 12 - 1 . pep AVTAP I FVPMLMLAGYAPEV I QAAYR I GDSVTN I I TPMMS YFGL I MATVI KYKKDAGVGT 

I M 1 1 M 1 1 1 1 1 1 1 1 1 1 1 1 II M 1 1 II II 1 1 1 1 1 1 M M 1 1 1 1 1 1 1 II 1 1 1 1 II I 1 1 1 ! 

orf 12ng AVTAP I FVPMLMLAGYAPEVI QAAYR I GDSVTN 1 1 TPMMS YFGL IMATV I KYKKDAGVGT 

430 440 450 460 470 480 






490 500 510 520 

orf 12- 1 . pep LISMMLPYSAFFLIAWIALFCIWVFVLGLPVGPGAPTFYPAPX 

II 1 1 1 1 1 1 II 1 1 M 1 1 II 1 1 II 1 1 1 1 1 M 1 1 1 M 1 1 1 1 M M 

orfl2ng LISMMLPYSAFFLIAWIALFCIWVFVLGLPVGPGTPTFYPVPX 

490 500 510 520 



In addition, ORF12ng (SEQ ID NO: 140) shows significant homology with a hypothetical protein 
(SEQ ID NO: 1124) from E. coli: 






sp|P46133|YDAH_ECOLI HYPOTHETICAL 55.1 KD PROTEIN IN OGT-DBPA INTERGENIC REGION 
>gi|1787597 (AE000231) hypothetical protein in ogt 5' region [Escherichia coli] 
Length = 510 
Score = 329 bits (835), Expect = 2e-89 

Identities = 178/507 (35%), Positives = 281/507 (55%), Gaps = 15/507 (2%) 

Query: 8 RSGRFLRTVEWLGNMLPHPVTXXXXXXXXXXXASAVGAYFGLSVPDPRPVGAKGRADDGL 67 

+SG+ VE +GN +PHP +A+ + FG+S +P D 
Sbjct: 13 QSGKLYGWVER I GNKVPHP FLLF I YL 1 1 VLMVTTAI LS AFGVS AKNP TDGTP 64 

Query: 68 IHWSLLDADGLI KI LTHTVKNFTGFAPXXXXXXXXXXXXI AEKSGLI SALMRLLLTKS P 127 

+ V +LL +GL L + +KNF+GFAP +AE+ GL+ ALM + + 

Sbjct: 65 VVVKNLLSVEGLHWFLPNVIKNFSGFAPLGAILALVLGAGLAERVGLLPALMVKMASHVN 124 

Query: 128 RKLTTFMVVFTGILSNTASELGYVVLIPLSAVIFHSLGRHPLAGLAAAFAGVSGGYSANL 187 

+ ++MV+F S+ +S+ V++ P+ A+IF ++GRHP+AGL AA AGV G++ANL 
Sbjct: 125 AR Y AS YM VL FIAFFSHISS D AAL V IM P PMG AL I FLAVGRH P VAGLLAA I AGVGCGFTANL 184 

Query: 188 FLGTIDPLLAGITQQAAQI IHPDYWGPEANWFFMAASTFVIALIGYFVTEKIVEPQLGP 247 

+ T D LL+GI+ +AA +P V NW+FMA+S V+ ++G +T+KI+EP+LG 
Sbjct: 185 LIVTTDVLLSGISTEAAAAFNPQMHVSVIDNWYFMASSVWLTIVGGLITDKIIEPRLGQ 244 

Query: 248 YQSDLSQEEKDIRHSNEITPLEYKGLIWAGVVFVALSALLAWSIVPADGILRHPETGLVA 307 

+q + ++ + + s GL AGW + A +A ++P +GILR P V 

Sbjct: 245 WQGNSDEKLQTLTESQRF GLRIAGWSLLFIAAIALMVIPQNGILRDPINHTVM 298 






Query: 308 GS PFLKS I WF I FLLFALPGI VYGRITRSLRGEREWNAMAESMSTLGLYLXXXXXXXXX 367 

SPF+K IV I L F + + YG TR++R + ++ + M E M + ++ 
Sbjct: 299 PSPFIKGIVPLIILFFFWSLAYGIATRTIRRQADLPHLMIEPMKEMAGFIVMVFPLAQF 358 



Query: 368 XXXXNWTNI GQY I AVKGAVFLKE VGLGGS VLF IGF I L I CAF INLM IGS AS AQWAVTAP I F 427 

NW+N+G++IAV L+ GL G F+G L+ +F+ + I S SA W++ APIF 

Sbjct: 359 VAMFNWSNMGKFIAVGLTDILESSGLSGIPAFVGLALLSSFLCMFIASGSAIWSILAPIF 418 






Query: 428 VPMLMLAGYAPEVIQAAYRIGDSVTNIITPMMSYFGLIMATVIKYKKDAGVGTLISMMLP 487 

VPM ML G+ P Q +RI DS + P+ + L + + +YK DA +GT S++LP 
Sbjct: 419 VPMFMLLGFHPAFAQILFRIADSSVLPLAPVSPFVPLFLGFLQRYKPDAKLGTYYSLVLP 478 






Query: 488 YSAFFLIAWIALFCIWVFVLGLPVGPG 514 

Y FL+ W+ + W +++GLP+GPG 
Sbjct: 479 YPLIFLWWLLMLLAW- YLVGLPIGPG 504 
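
The percentages in the summary line of the report above follow directly from the counts it gives. A minimal sketch, assuming the summary line has the form shown in this report; the function name `parse_blast_summary` is hypothetical and the regular expression is written for this one line only, not for BLAST output in general.

```python
import re

# Summary line copied from the report above.
SUMMARY = "Identities = 178/507 (35%), Positives = 281/507 (55%), Gaps = 15/507 (2%)"


def parse_blast_summary(line):
    """Return {label: (count, total, reported_percent)} for a summary line."""
    pattern = re.compile(r"(\w+) = (\d+)/(\d+) \((\d+)%\)")
    return {
        label: (int(count), int(total), int(pct))
        for label, count, total, pct in pattern.findall(line)
    }


stats = parse_blast_summary(SUMMARY)
for label, (count, total, pct) in stats.items():
    # Recompute each percentage from the raw counts for comparison.
    print(f"{label}: {count}/{total} = {100 * count / total:.2f}% (reported {pct}%)")
```

In this report the gap fraction 15/507 is about 2.96% but is printed as 2%, which suggests the reported percentages are truncated rather than rounded.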



Based on this analysis, including the presence of several putative transmembrane domains and the 
predicted actinin-type actin-binding domain signature (shown in bold) in the gonococcal protein, it 
is predicted that the proteins from N. meningitidis and N. gonorrhoeae, and their epitopes, could be 
useful antigens for vaccines or diagnostics, or for raising antibodies. 

Example 17 

The following partial DNA sequence was identified in N. meningitidis [<SEQ ID 141>] (SEQ ID 
NO: 141): 

1 ACAGCCGGCG CAGCAGGTTn CnCGGTCTTC GTTTTCGTAA CGGACAGTCA 
51 GGTGGAGGTG TTCGGGAACA TCCAGACCGC AGTGGAAACA GGTTTTTTTC 
101 ATGGCATTTC GGTTTCGTCT GTGTTTGGTG CGGCGGCACA AGACTCGGCA 
151 ATgGCTTCGC GCAGTGCGTC TATACCGGTA TTTTCAGCAA CGGAAATGCG 
201 GACGGcGgCA ATTTTTCCCG CAGCGTCGCG CCATATGCCC GTGTTTTgTT 
251 CTTCAGACGG CAGCAGGTCG GTTTTGTTGT ACACCTTgAT GCACGGAaTA 
301 TCGCCGGCAT GGATTTCTTG CAGTACGTTT TCCACGTCTT CAATCTGCTG 
351 TCCGCTGTTC GGAGCGGCGG CATCGACGAC GTGCAGCAGC ACATCgGcTT 
401 gCGCGGTTTC TTCCAGCGTG GCgGAAAAGG CGGAAATCAG TTTgTGCGGC 
451 agATyGCTnA CGAATCCGAC GGTATCGGTC AGGATAATGC TGCATTCGGG 
501 ACT.. 

This corresponds to the amino acid sequence [<SEQ ID 142; ORF14>] (SEQ ID NO: 142; 
ORF14): 

1 TAGAAGXXVF VFVTDSQVEV FGNIQTAVET GFFHGISVSS VFGAAAQDSA 
51 MASRSASIPV FSATEMRTAA IFPAASRHMP VFCSSDGSRS VLLYTLMHGI 
101 SPAWISCSTF STSSICCPLF GAAASTTCSS TSACAVSSSV AEKAEISLCG 
151 RXLTNPTVSV RIMLHSG.. 

Computer analysis of this amino acid sequence gave the following results: 
Homology with a predicted ORF from N. meningitidis (strain A) 

ORF14 (SEQ ID NO: 142) shows 94.0% identity over a 167aa overlap with an ORF 
(SEQ ID NO: 144) from strain A of N. meningitidis: 

10 20 30 

orf 14 .pep TAGAAGXXVFVFVTDSQVEVFGNIQTAVET 

HIM II I II I I : I - : I! I I : I I I I I 
orf 14a GRQLGFLRVGGALFVITAQARVNNALCDCLTTGAAGFAVFVFVTDGQMQVFGNVQPAVET 

150 160 170 180 190 200 



40 50 60 70 80 90 

orf 14 . pep GFFHGISVSSVFGAAAQDSAMASRSASIPVFSATEMRTAAIFPAASRHMPVFCSSDGSRS 

1 1 1 1 1 1 M 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 II I II 1 1 Ml 1 1 1 1 1 1 II 1 1 1 1 II 1 1 1 1 1 1 1 1 1' 

orf 14a GFFHGISVSSVFGAAAQYSAMASRSASIPVFSATEMRTAAIFPAASRHMPVFCSSDGSRS 
210 220 230 240 250 260 



100 110 120 130 140 150 

orf 14 .pep VLLYTLMHGISPAWISCSTFSTSSICCPLFGAAASTTCSSTSACAVSSSVAEKAEISLCG 

I I I I I I i I I I I I I I I I I I I I I I I I I I I I I I I 

orf 14a VLLYTLMHGISPAWISCSTFSTSSICCPLFGAAASTTCSSTSACAVSSSVAEKAEISLCG 
270 280 290 300 310 320 



160 

orf 14 . pep RXLTNPTVSVRIMLHSG 
I IIIIIIIIIIIIMI 

orf 14a RSLTNPTVSVRIMLHSGLMYSRRAWSSVAKSWSFAYMPDLVSRLNRLDLPTLVX 
330 340 350 360 370 380 



The complete length ORF14a nucleotide sequence [<SEQ ID 143>] (SEQ ID NO: 143) is: 



1 ATGGAGGATT TGCAGGAAAT CGGGTTCGAT GTCGCCGCCG TAAAGGTAGG 

51 TCGGCAGCGC GAACATCATC GTCTGCATCA TCCCCAGCCC GGCAACGGCG 

101 AGGCGGACGA TGTATTGTTT GCGTTCTTTT TGGTTGGCGG CTTCGATTTT 

151 TTGCGCGTCA TAGGGTGCGG CGGTGTAGCC TATCTGCCTG ATTTTCAACA 

201 GAATGTCGGA AAGGCGGATT TTGCCGTCGT CCCAGACGAC GCGGCAGCGG 

251 TGCGTGCTGT AATTGAGGTC GATGCGGACG ATGCCGTCTG TACGCAAAAG 

301 CTGCTGTTCG ATCAGCCAGA CGCAGGCGGC GCAGGTGATG CCGCCGAGCA 

351 TTAAAACCGC CTCGCGCGTG CCGCCGTGGG TTTCCACAAA GTCGGACTGG 

401 ACTTCGGGCA GGTCGTACAG GCGGATTTGG TCGAGGATTT CTTGGGGCGG 

451 CAGCTCGGTT TTTTGCGCGT CGGCGGTGCG TTGTTTGTAA TAACTGCCCA 

501 AGCCCGCGTC AATAATGCTT TGTGCGACTG CCTGACAACC GGCGCAGCAG 

551 GTTTCGCGGT CTTCGTTTTC GTAACGGACG GTCAGATGCA GGTTTTCGGG 

601 AACGTCCAGC CCGCAGTGGA AACAGGTTTT TTTCATGGCA TTTCGGTTTC 

651 GTCTGTGTTT GGTGCGGCGG CACAATACTC GGCAATGGCT TCGCGCAGTG 

701 CGTCTATACC GGTATTTTCA GCAACGGAAA TGCGGACGGC GGCAATTTTT 

751 CCCGCAGCGT CGCGCCATAT GCCCGTGTTT TGTTCTTCAG ACGGCAGCAG 

801 GTCGGTTTTG TTGTACACCT TGATGCACGG AATATCGCCG GCATGGATTT 

851 CTTGCAGTAC GTTTTCCACG TCTTCAATCT GCTGTCCGCT GTTCGGAGCG 

901 GCGGCATCGA CGACGTGCAG CAGCACATCG GCTTGCGCGG TTTCTTCCAG 

951 CGTGGCGGAA AAGGCGGAAA TCAGTTTGTG CGGCAGATCG CTGACGAATC 

1001 CGACGGTATC GGTCAGGATA ATGCTGCATT CGGGACTGAT GTACAGCCGC 

1051 CGCGCCGTCG TGTCGAGTGT GGCGAAAAGC TGGTCTTTCG CATATATGCC 

1101 CGACTTGGTC AGCCGGTTGA ACAGACTGGA TTTGCCGACA TTGGTATAG 



This encodes a protein having amino acid sequence [<SEQ ID 144>] (SEQ ID NO: 144): 



1 MEDLQEIGFD VAAVKVGRQR EHHRLHHPQP GNGEADDVLF AFFLVGGFDF 
51 LRVIGCGGVA YLPDFQQNVG KADFAVVPDD AAAVRAVIEV DADDAVCTQK 
101 LLFDQPDAGG AGDAAEH*NR LARAAVGFHK VGLDFGQVVQ ADLVEDFLGR 
151 QLGFLRVGGA LFVITAQARV NNALCDCLTT GAAGFAVFVF VTDGQMQVFG 
201 NVQPAVETGF FHGISVSSVF GAAAQYSAMA SRSASIPVFS ATEMRTAAIF 
251 PAASRHMPVF CSSDGSRSVL LYTLMHGISP AWISCSTFST SSICCPLFGA 
301 AASTTCSSTS ACAVSSSVAE KAEISLCGRS LTNPTVSVRI MLHSGLMYSR 
351 RAVVSSVAKS WSFAYMPDLV SRLNRLDLPT LV* 

It should be noted that this sequence includes a stop codon at position 118. 
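
The position of the stop codon can be checked by translating the relevant stretch of the ORF14a nucleotide sequence. The following is a minimal sketch, assuming the standard genetic code and reading frame 1; the function names `translate` and `stop_positions` are hypothetical, and the excerpt is nucleotides 301-360 of SEQ ID NO: 143 as listed above, which begin at codon 101.

```python
from itertools import product

# Standard genetic code, with codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def translate(dna):
    """Translate DNA in frame 1; codons with ambiguous bases become 'X'."""
    dna = "".join(dna.split()).upper()
    usable = len(dna) - len(dna) % 3
    return "".join(CODON_TABLE.get(dna[i:i + 3], "X") for i in range(0, usable, 3))


def stop_positions(protein, first_residue=1):
    """Return residue numbers of '*' symbols, numbered from first_residue."""
    return [first_residue + i for i, aa in enumerate(protein) if aa == "*"]


# Nucleotides 301-360 of SEQ ID NO: 143 (codons 101-120).
excerpt = "CTGCTGTTCG ATCAGCCAGA CGCAGGCGGC GCAGGTGATG CCGCCGAGCA TTAAAACCGC"
protein = translate(excerpt)
print(protein)
print(stop_positions(protein, first_residue=101))
```

The translated excerpt corresponds to residues 101-120 of the amino acid listing, with the TAA codon at nucleotides 352-354 giving the stop at residue 118.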



Homology with a predicted ORF from N. gonorrhoeae 



ORF14 (SEQ ID NO: 142) shows 89.8% identity over a 167aa overlap with a predicted ORF 
(ORF14.ng) (SEQ ID NO: 146) from N. gonorrhoeae: 



orf 14 .pep TAGAAGXXVFVFVTDSQVEVFGNIQTAVET 30 

II III II : I I : I ^ I : : I I I ! ^ I ill 
orf 14ng GRQFGFFRVGGASFVITAQAGIDDALCDCLTADAAGFAVFAFVADGQMQVFGNVQPAVET 208 

orf 14 . pep GFFHGI S VSSVFGAAAQDSAMASRSAS I PVFS ATEMRTAAI FPAASRHMPVFCSSDGSRS 90 

Illllllllllllllll Mlllll IIIIIMIMIIMIIIIIIIIIIII IIIMM 

or f 14 ng GFFHGI SVSSVFGAAAQ YSAMASRS AS I PVFS ATEMRTAAI FPAASRHMPVFCSSDGSRS 268 

orf 14 .pep VLLYTLMHGISPAWISCSTFSTSSICCPLFGAAASTTCSSTSACAVSSSVAEKAEISLCG 150 

IIIIIIIIIM llllllllllllllllll IIIIIMI MMhllhl Mllllllll 

orf 14ng VLLYTLMHGISWAWISCSTFSTSSICCPLFRAAASTTCSSTSACTVSSKVAEKAEISLCG 328 

orf 14. pep RXLTNPTVSVRIMLHSG 167 

I Mill MINIM 

orf 14ng RSLTNPTVSVRIMLHAGLMYSRRAWSRVAKSWSFAYMPDLVSRLNRLDLPTLV 3 82 

The complete length ORF14ng nucleotide sequence [<SEQ ID 145>] (SEQ ID NO: 145) is 
predicted to encode a protein having amino acid sequence [<SEQ ID 146>] (SEQ ID NO: 146): 



1 MEDLQEIGFD VAAVKVGRQR EHHRLHHTQS GNGKADD VLF AFFLVGGFDF 

51 LRVI GCGGVA CLPDFQQNVG EADFAWPDD AAAVRAV I E V DADDAVCAQK 

101 LLFDQPDAGG AGNAAEHQHC FVRAIMGFHK VGLDFGQWQ ADLVEDFLGR 

151 QFGFFRVGGA SFVITAQAGI DDALCDCLTA DAAGFAVFAF VADGQMQVFG 

201 NVQPAVETGF FHGISVSSVF GAAAQYSAMA SRSASIPVFS ATEMRTAAIF 

251 PAASRHMPVF CSSDGSRSVL LYTLMHGISW AWISCSTFST SSICCPLFRA 

301 AASTTCSSTS ACTVSSKVAE KAEISLCGRS LTNPTVSVRI MLHAGLMYSR 

351 RAWSRVAKS WSFAYMPDLV SRLNRLDLPT LV* 

Based on the putative transmembrane domain in the gonococcal protein, it is predicted that the 
proteins from N. meningitidis and N. gonorrhoeae, and their epitopes, could be useful antigens for 
vaccines or diagnostics, or for raising antibodies. 

Example 18 

The following partial DNA sequence was identified in N. meningitidis [<SEQ ID 147>] (SEP ID 
NO: 147) : 

1 . . GGCCATTACT CCGACCGCAC TTGGAAGCCG CGTTTGGNCG GCCGCCGTCT 

51 GCCGTATCTG CTTTATGGCA CGCTGATTGC GGTTATTGTG ATGATTTTGA 

101 TGCCGAACTC GGGCAGCTTC GGTTTCGGCT ATGCGTCGCT GGCGGCTTTG 

151 TCGTTCGGCG CGCTGATGAT TGCGCTGTTA GACGTGTCGT CAAATATGGC 

201 GATGCAGCCG TTTAAGATGA TGGTCGGCGA CATGGTCAAC GAGGAGCAGA 

251 AAA . NTACGC CTACGGGATT CAAAGTTTCT TAGCAAATAC GGGCGCGGTC 

301 GTGGCGGCGA TTCTGCCGTT TGTGTTTGCG TATATCGGTT TGGCGAACAC 

351 CGCCGANAAA GGCGTTGTGC CGCAGACCGT GGTCGTGGCG TTTTATGTGG 

401 GTGCGGCGTT GCTGGTGATT ACCAGCGCGT TCACGATTTT CAAAGTGAAG 

451 GAATACGANC CGGAAACCTA CGCCCGTTAC CACGGCATCG ATGTCGCCGC 

501 GAATCAGGAA AAAGCCAACT GGATCGCACT CTTAAAA . CC GCGC . . 

This corresponds to the amino acid sequence [<SEQ ID 148; ORF16>] (SEQ ID NO: 148; 
ORF16): 



1 . . GHYSDRTWKP RLXGRRLPYL LYGTLIAVIV MILMPNSGSF GFGYASLAAL 
51 SFGALMIALL DVSSNMAMQP FKMMVGDMVN EEQKXYAYGI QSFLANTGAV 
101 VAAILPFVFA YIGLANTAXK GVVPQTVVVA FYVGAALLVI TSAFTIFKVK 
151 EYXPETYARY HGIDVAANQE KANWIALLKX A. . 

Further work revealed the complete nucleotide sequence [<SEQ ID 149>] (SEP ID NO: 149) : 



1 ATGTCGGAAT ATACGCCTCA AACAGCAAAA CAAGGTTTGC CCGCGCTGGC 

51 AAAAAGCACG ATTTGGATGC TCAGTTTCGG CTTTCTCGGC GTTCAGACGG 

101 CCTTTACCCT GCAAAGCTCG CAAATGAGCC GCATTTTTCA AACGCTAGGC 

151 GCAGACCCGC ACAATTTGGG CTGGTTTTTC ATCCTGCCGC CGCTGGCGGG 

201 GATGCTGGTG CAGCCGATTG TCGGCCATTA CTCCGACCGC ACTTGGAAGC 

251 CGCGTTTGGG CGGCCGCCGT CTGCCGTATC TGCTTTATGG CACGCTGATT 

301 GCGGTTATTG TGATGATTTT GATGCCGAAC TCGGGCAGCT TCGGTTTCGG 
351 CTATGCGTCG CTGGCGGCTT TGTCGTTCGG CGCGCTGATG ATTGCGCTGT 

401 TAGACGTGTC GTCAAATATG GCGATGCAGC CGTTTAAGAT GATGGTCGGC 
451 GACATGGTCA ACGAGGAGCA GAAAGGCTAC GCCTACGGGA TTCAAAGTTT 
501 CTTAGCAAAT ACGGGCGCGG TCGTGGCGGC GATTCTGCCG TTTGTGTTTG 
551 CGTATATCGG TTTGGCGAAC ACCGCCGAGA AAGGCGTTGT GCCGCAGACC 
601 GTGGTCGTGG CGTTTTATGT GGGTGCGGCG TTGCTGGTGA TTACCAGCGC 
651 GTTCACGATT TTCAAAGTGA AGGAATACGA TCCGGAAACC TACGCCCGTT 
701 ACCACGGCAT CGATGTCGCC GCGAATCAGG AAAAAGCCAA CTGGATCGAA 
751 CTCTTGAAAA CCGCGCCTAA GGCGTTTTGG ACGGTTACTT TGGTGCAATT 
801 CTTCTGCTGG TTCGCCTTCC AATATATGTG GACTTACTCG GCAGGCGCGA 
851 TTGCGGAAAA CGTCTGGCAC ACCACCGATG CGTCTTCCGT AGGTTATCAG 
901 GAGGCGGGTA ACTGGTACGG CGTTTTGGCG GCGGTGCAGT CGGTTGCGGC 
951 GGTGATTTGT TCGTTTGTAT TGGCGAAAGT GCCGAATAAA TACCATAAGG 
1001 CGGGTTATTT CGGCTGTTTG GCTTTGGGCG CGCTCGGCTT TTTCTCCGTT 
1051 TTCTTCATCG GCAACCAATA CGCGCTGGTG TTGTCTTATA CCTTAATCGG 
1101 CATCGCTTGG GCGGGCATTA TCACTTATCC GCTGACGATT GTGACCAACG 
1151 CCTTGTCGGG CAAGCATATG GGCACTTACT TGGGCTTGTT TAACGGCTCT 

1201 ATCTGTATGC CTCAAATCGT CGCTTCGCTG TTGAGTTTCG TGCTTTTCCC 
1251 TATGCTGGGC GGCTTGCAGG CCACTATGTT CTTGGTAGGG GGCGTCGTCC 

1301 TGCTGCTGGG CGCGTTTTCC GTGTTCCTGA TTAAAGAAAC ACACGGCGGG 
1351 GTTTGA 

This corresponds to the amino acid sequence [<SEQ ID 150; ORF16-1>] (SEQ ID NO: 150; 
ORF16-1): 

1 MSEYTPQTAK QGLPALAKST IWMLSFGFLG VQTAFTLQSS QMSRIFQTLG 
51 ADPHNLGWFF ILPPLAGMLV QPIVGHYSDR TWKPRLGGRR LPYLLYGTLI 
101 AVIVMILMPN SGSFGFGYAS LAALSFGALM IALLDVSSNM AMQPFKMMVG 
151 DMVNEEQKGY AYGIQSFLAN TGAVVAAILP FVFAYIGLAN TAEKGVVPQT 
201 VVVAFYVGAA LLVITSAFTI FKVKEYDPET YARYHGIDVA ANQEKANWIE 
251 LLKTAPKAFW TVTLVQFFCW FAFQYMWTYS AGAIAENVWH TTDASSVGYQ 
301 EAGNWYGVLA AVQSVAAVIC SFVLAKVPNK YHKAGYFGCL ALGALGFFSV 
351 FFIGNQYALV LSYTLIGIAW AGIITYPLTI VTNALSGKHM GTYLGLFNGS 
401 ICMPQIVASL LSFVLFPMLG GLQATMFLVG GVVLLLGAFS VFLIKETHGG 
451 V* 

Computer analysis of this amino acid sequence gave the following results: 
Homology with a predicted ORF from N. meningitidis (strain A) 

ORF16 (SEQ ID NO: 148) shows 96.7% identity over a 181aa overlap with an ORF (ORF16a) 
(SEQ ID NO: 152) from strain A of N. meningitidis: 

10 20 30 

or f 1 6 . pep GHYSDRTWKPRLXGRR LPYLLYGTLIAVIV 

IIIMIIIIIM : I M I I I I I I I I I I I I I 
orf 16a IFQTLGADPHSLGW FFILPPLAGMLVQPIVG HYSDRTWKPRLGGRR LPYLLYGTLIAVIV 
50 60 70 80 90 100 

40 50 60 70 80 90 

orf 16 . pep MILMPNSGSFGFGY ASLAALSFGALMIALLDV SSNMAMQPFKMIWGDMVNEEQKXYAYGI 

1 1 1 1 1 1 : 1 1 II M 1 1 I II I M 1 1 ; 1 1 1 1 1 1 M ,1 1 II 1 1 1 1! 1 1 i I M 1 1 ; Mill 

orf 16a MILM PNSGSFGFGYA SILtAALSFGALMIALLDV SSNMAMQPFKMMVGDMVNEEQKGYAYGI 
110 120 130. 140 150 160 

100 110 120 130 140 150 

orf 16 .pep QS FLANTG AWAA I L P FVFAY I GLAN TAXKG WPQT VWAF YVGAALLV I TS A FT I FKVK 

IIIIIIIIIIIIIMIMIIIIIIIIII II M I I I I II II I I I II I II I I II I I II I II 

or f 1 6a QSFLANTG AWAAILPFVFAYIGLAN TAEKGWPQT VWAFYVGAALLVITSA FTIFKVK 
170 180 190 200 210 220 

160 170 180 

orf 16 .pep EYXPETYARYHGIDVAANQEKANWIALLKXA 

II II II I II I II II I I I I I I I I I I llhl 
orf 16a EYNPETYARYHGIDVAANQEKANWIELLKTAPKAFWTVTLVQFFCWFAFQYMWTYSAGAI 

230 240 250 260 270 280 

orf 16a AENVWHTTDASSVGYQEAGNWYGVLAAVQSVAAVICSFVLAKVPNKYHKAGYFGCLALGA 

290 300 310 320 330 340 

The complete length ORF16a nucleotide sequence [<SEQ ID 151>] (SEQ ID NO: 151) is: 

1 ATGTCGGAAT ATACGCCTCA AACAGCAAAA CAAGGTTTGC CCGCGCTGGC 

51 AAAAAGCACG ATTTGGATGC TCAGTTTCGG CTTTCTCGGC GTTCAGACGG 

101 CCTTTACCCT GCAAAGCTCG CAGATGAGCC GCATCTTCCA GACGCTCGGT 

151 GCCGATCCGC ACAGCCTCGG CTGGTTCTTT ATCCTGCCGC CGCTGGCGGG 

201 GATGCTGGTG CAGCCGATTG TCGGCCATTA CTCCGACCGC ACTTGGAAGC 

251 CGCGTTTGGG CGGCCGCCGT CTGCCGTATC TGCTTTATGG CACGCTGATT 

301 GCGGTTATTG TGATGATTTT GATGCCGAAC TCGGGCAGCT TCGGTTTCGG 

351 CTATGCGTCG CTGGCGGCTT TGTCGTTCGG CGCGCTGATG ATTGCGCTGT 

401 TAGACGTGTC GTCAAATATG GCGATGCAGC CGTTTAAGAT GATGGTCGGC 
451 GACATGGTCA ACGAGGAGCA GAAAGGCTAC GCCTACGGGA TTCAAAGTTT 

501 CTTAGCGAAT ACGGGCGCGG TCGTGGCGGC GATTCTGCCG TTTGTGTTTG 

551 CGTATATCGG TTTGGCGAAC ACCGCCGAGA AAGGCGTTGT GCCGCAGACC 

601 GTGGTCGTGG CGTTTTATGT GGGTGCGGCG TTGCTGGTGA TTACCAGCGC 

651 GTTCACGATT TTCAAAGTGA AGGAATACAA TCCGGAAACC TACGCCCGTT 

701 ACCACGGCAT CGATGTCGCC GCGAATCAGG AAAAAGCCAA CTGGATCGAA 

751 CTCTTGAAAA CCGCGCCTAA GGCGTTTTGG ACGGTTACTT TGGTGCAATT 

801 CTTCTGCTGG TTCGCCTTCC AATATATGTG GACTTACTCG GCAGGCGCGA 

851 TTGCGGAAAA CGTCTGGCAC ACCACCGATG CGTCTTCCGT AGGTTATCAG 

901 GAGGCGGGTA ACTGGTACGG CGTTTTGGCG GCGGTGCAGT CGGTTGCGGC 

951 GGTGATTTGT TCGTTTGTAT TGGCGAAAGT GCCGAATAAA TACCATAAGG 

1001 CGGGTTATTT CGGCTGTTTG GCTTTGGGCG CGCTCGGCTT TTTCTCCGTT 

1051 TTCTTCATCG GCAACCAATA CGCGCTGGTG TTGTCTTATA CCTTAATCGG 

1101 CATCGCTTGG GCGGGCATTA TCACTTATCC GCTGACGATT GTGACCAACG 

1151 CCTTGTCGGG CAAGCATATG GGCACTTACT TGGGCCTGTT TAACGGCTCT 

1201 ATCTGTATGC CGCAAATCGT CGCTTCGCTG TTGAGTTTCG TGCTTTTCCC 

1251 TATGCTGGGC GGCTTGCAGG CCACTATGTT CTTGGTAGGG GGCGTCGTCC 

1301 TGCTGCTGGG CGCGTTTTCC GTGTTCCTGA TTAAAGAAAC ACACGGCGGG 

1351 GTTTGA 

This encodes a protein having amino acid sequence [<SEQ ID 152>] (SEQ ID NO: 152): 



1 MSEYTPQTAK QGLPALAKST IWMLSFGFLG VQTAFTLQSS QMSRIFQTLG 
51 ADPHSLGWFF ILPPLAGMLV QPIVGHYSDR TWKPRLGGRR LPYLLYGTLI 
101 AVIVMILMPN SGSFGFGYAS LAALSFGALM IALLDVSSNM AMQPFKMMVG 
151 DMVNEEQKGY AYGIQSFLAN TGAVVAAILP FVFAYIGLAN TAEKGVVPQT 
201 VVVAFYVGAA LLVITSAFTI FKVKEYNPET YARYHGIDVA ANQEKANWIE 
251 LLKTAPKAFW TVTLVQFFCW FAFQYMWTYS AGAIAENVWH TTDASSVGYQ 
301 EAGNWYGVLA AVQSVAAVIC SFVLAKVPNK YHKAGYFGCL ALGALGFFSV 
351 FFIGNQYALV LSYTLIGIAW AGIITYPLTI VTNALSGKHM GTYLGLFNGS 
401 ICMPQIVASL LSFVLFPMLG GLQATMFLVG GVVLLLGAFS VFLIKETHGG 
451 V* 






ORF16a (SEQ ID NO: 152) and ORF16-1 (SEQ ID NO: 150) show 99.6% identity in 451 aa 
overlap: 



10 20 30 40 50 60 

orf 16a . pep MSEYTPQTAKQGLPALAKSTIWMLSFGFLGVQTAFTLQSSQMSRIFQTLGADPHSLGWFF 

1 1 1! I 1 1 1 1 1 1 1 1 1 1 ! I i 1 1 1 M 1 1 1 1 1 II 1 1 I II 1 1 1 1 M 1 : 1 1 1 1 1 M h II I M 

orf 16-1 MSEYTPQTAKQGLPALAKSTIWMLSFGFLGVQTAFTLQSSQMSRIFQTLGADPHNLGWFF 

10 20 30 40 50 60 

70 80 90 100 110 120 

or f 1 6a . pep I LPPLAGMLVQP I VGHYSDRTWKPRLGGRRLPYLLYGTLI AVIVM ILMPNSGS FGFGYAS 

I L j 1 1 i 1 1 1 j I i I i 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 

orf 16 - 1 ILPPLAGMLVQP I VGHYSDRTWKPRLGGRRLPYLLYGTLI AVI VMILMPNSGS FGFGYAS 

70 80 90 100 110 120 

130 140 150 160 170 180 

orf 16a . pep LAALSFGALMIALLDVSSNMAMQPFKMMVGDMVNEEQKGYAYGIQSFLANTGAVVAAILP 

||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||

orf 16 - 1 LAALSFGALMIALLDVSSNMAMQPFKMMVGDMVNEEQKGYAYGIQSFLANTGAVVAAILP 
130 140 150 160 170 180 

190 200 210 220 230 240 

orf 16a . pep FVFAY I GLANTAEKGWPQTVWAFYVGAALLVI TSAFT I FKVKE YNPETYARYHG IDVA 

IIIIIIIIIIIIIIIIIIIIIIIIMIMIIIIII I I IMIIMIIIIIIIIIII 

orf 16-1 FVFAYIGLANTAEKGVVPQTVWAFYVGAALLVITSAFTI FKVKEYDPETYARYHGIDVA 

190 200 210 220 230 240 

250 260 270 280 290 300 

orf 16a . pep ANQEKANWIELLKTAPKAFWTVTLVQFFCWFAFQYMWTYSAGAIAENVWHTTDASSVGYQ 

1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 

orf 16 - 1 ANQEKANWIELLKTAPKAFWTVTLVQFFCWFAFQYMWTYSAGAIAENVWHTTDASSVGYQ 
250 260 270 280 290 300 

310 320 330 340 350 360 

orf 16a . pep EAGNWYGVLAAVQSVAAVICSFVLAKVPNKYHKAGYFGCLALGALGFFSVFFIGNQYALV 

1 1 II 1 1 I 1 1 1 1 1 1 1 1 1 1 M 1 1 1 1 1 1 1 II I I IM I ! II 1 1 1 1 1 1 1 1 1 M 1 1 1 1 II II I 

or f 1 6 - 1 EAGNWYGVLAAVQSVAAVI CS FVLAKVPNKYHKAGYFGCLALGALGFFS VFF I GNQYALV 

310 320 330 340 350 360 

370 380 390 400 410 420 

orf 16a. pep LSYTLIGIAWAGIITYPLTIVTNALSGKHMGTYLGLFNGSICMPQIVASLLS FVLFPMLG 

IIIIMIMMIIIIIIIMIIIIIIIIIIIIMIMIIIIIIIIIIIIIIIIIIIIIM 

orf 16-1 LSYTLIGIAWAGIITYPLTIVTNALSGKHMGTYLGLFNGSICMPQIVASLLSFVLFPMLG 
370 380 390 400 410 420 

430 440 450 

orf 16a . pep GLQATMFLVGGWLLLGAFSVFLIKETHGGVX 

I I I I I I I I I I I I I ' I I I I I I I I ' I I M I 
orf 16- 1 GLQATMFLVGGWLLLGAFS VFL I KETHGGVX 

430 440 450 

Homology with a predicted ORF from N. gonorrhoeae 

ORF16 (SEQ ID NO: 148) shows 93.9% identity over a 181aa overlap with a predicted ORF 
(ORF16.ng) (SEQ ID NO: 154) from N. gonorrhoeae: 

orf 16. pep GHYSDRTWKPRLXGRRLPYLLYGTLIAVIV 30 

40 I : I I I I I II I I I I I I I I I I I I I I I I I I I I 

orf 16ng HFSNARRRPAQFGLVFHPAAAGGDAGSADSGYYSDRTWKPRLGGRRLPYLLYGTLIAVIV 131 

orf 16 . pep MILMPNSGSFGFGYASLAALSFGALMIALLDVSSNMAMQPFKMMVGDMVNEEQKXYAYGI 90 

IIIIIIIIIIIIMIIIIIIIIIIIIIIIIIIIIIIIIIIMIIIIMIIIIII Mill 

orf 16ng MILMPNSGSFGFGYASLAALSFGALMIALLDVSSNMAMQPFKMMVGDMVNEEQKSYAYGI 191 



orf 16 .pep QSFLANTGAVVAAILPFVFAYIGLANTAXKGVVPQTVVVAFYVGAALLVITSAFTIFKVK 150 

Illllll Illllll MINIM II llllllllllllllllllhlllllll III 

orf 16ng QS FLANTDAWAAI LP FVFAY I GLANTAEKGWPQTVWAFYVGAALL 1 1 TS AFT I SKVK 251 

orf 16 .pep EYXPETYARYHGIDVAANQEKANWIALLKXA 181 

II M I M I I I I I M I I I M I II M llhl 
orf 16ng EYDPETYARYHGIDVAANQEKANWFELLKTAPKVFWTVTPVQFFCWFAFRYMWTYSAGAI 311 

The complete length ORF16ng nucleotide sequence [<SEQ ID 153>] (SEQ ID NO: 153) is: 



1 ATGATAGGGG ATCGCCGCGC CGGCAACCAT TTCGGATTTT CCAAAGCAAA 

51 TACTTTTCAA ATCAAAAAAA AGGATTTACT TTATGTCGGA ATATACGCCT 

101 CAAACAGCAA AACAAGGTTT GCCCGCGCCG GCAAAAAGCA CGATTTGGAT 

151 GTTGAGCTTC GGCTATCTCG GCGTTCAGAC GGCCTTTACC CTGCAAAGCT 

201 CGCAGATGAG CCGCATTTTT CAAACGCTAG GCGCAGACCC GCACAATTTG 

251 GGCTGGTTTT TCATCCTGCC GCCGCTGGCG GGGATGCTGG TTCAGCCGAT 

301 AGTGGCTACT ACTCAGACCG CACTTGGAAG CCGCGCTTGG GCGGCCGCCG 

351 CCTGCCGTAT CTGCTTTACG GCACGCTGAT TGCGGTCATC GTGATGATTT 

401 TGATGCCGAA CTCGGGCAGC TTCGGTTTCG GCTATGCGTC GCTGGCGGCC 

451 TTGTCGTTCG GCGCGCTGAT GATTGCGCTG TTGGACGTGT CGTCGAATAT 

501 GGCGATGCAG CCGTTTAAGA TGATGGTCGG CGATATGGTC AACGAGGAGC 

551 AGAAAAGCTA CGCCTACGGG ATTCAAAGTT TCTTAGCGAA TACGGACGCG 

601 GTTGTGGCAG CGATTCTGCC GTTTGTGTTC GCGTATATCG GTTTGGCGAA 

651 CACTGCCGAG AAAGGCGTTG TGCCACAAAC CGTGGTCGTA GCATTCTATG 

701 TGGGTGCGGC GTTACTGATT ATTACCAGTG CGTTCACAAT CTCCAAAGTC 

751 AAAGAATACG ACCCGGAAAC CTACGCCCGT TACCACGGCA TCGATGTCGC 

801 CGCGAATCAG GAAAAAGCCA ACTGGTTCGA ACTCTTAAAA ACCGCGCCTA 

851 AAGTGTTTTG GACGGTTACT CCGGTACAGT TTTTCTGCTG GTTCGCCTTC 

901 CGGTATATGT GGACTTACTC GGCAGGCGCG ATTGCAGAAA ACGTCTGGCA 

951 CACTACCGAT GCGTCTTCCG TAGGCCATCA GGAGGCGGGC AACCGGTACG 

1001 GCGTTTTGGC GGCGGTGTAG 

This encodes a protein having amino acid sequence [<SEQ ID 154>] (SEQ ID NO: 154): 



1 MIGDRRAGNH FGFSKANTFQ IKKKDLLYVG IYASNSKTRF ARAGKKHDLD 
51 VELRLSRRSD GLYPAKLADE PHFSNARRRP AQFGLVFHPA AAGGDAGSAD 
101 SGYYSDRTWK PRLGGRRLPY LLYGTLIAVI VMILMPNSGS FGFGYASLAA 
151 LSFGALMIAL LDVSSNMAMQ PFKMMVGDMV NEEQKSYAYG IQSFLANTDA 
201 VVAAILPFVF AYIGLANTAE KGVVPQTVVV AFYVGAALLI ITSAFTISKV 
251 KEYDPETYAR YHGIDVAANQ EKANWFELLK TAPKVFWTVT PVQFFCWFAF 
301 RYMWTYSAGA IAENVWHTTD ASSVGHQEAG NRYGVLAAV* 

ORF16ng (SEQ ID NO: 154) and ORF16-1 (SEQ ID NO: 150) show 89.3% identity in 261 aa 
overlap: 



30 40 50 60 70 80 

orf 16-1. pep MLSFGFLGVQTAFTLQSSQMSRIFQTLGADPHNLGWFFILPPLAGMLVQPI-VGHYSDRT 

I ::| I I II : hlllll 
orf 16ng DVELRLSRRSDGLYPAKLADEPHFSNARRRPAQFGLVF - HPAAAGGDAGSADSGYYSDRT 

50 60 70 80 90 100 



90 100 110 120 130 140 

orf 16-1. pep WKPRLGGRRLPYLLYGTLIAVIVMILMPNSGSFGFGYASLAALSFGALMIALLDVSSNMA 

I M M 1 1 1 1 M I II 1 1 1 1 1 M 1 1 1 M 1 1 M 1 1 M 1 1 1 M MM II II II 1 1 1 M M I M 

orf 16ng WKPRLGGRRLPYLLYGTLIAVIVMILMPNSGSFGFGYASLAALSFGALMIALLDVSSNMA 
110 120 130 140 150 160 

150 160 170 180 190 200

orf 16-1 .pep MQPFKMMVGDMVNEEQKGYAYG I QS FLANTGA WAA I LPFVFAY I GLANTAEKGWPQTV 

IIIIIIIIIIIIIIMhIIIIIIIIIIII I i I . M I M 1 1 i 1 1 1 II 1 1 1 1 1 1 1 M 1 1 ! 

orf 16ng MQPFKMMVGDMVNEEQKSYAYGIQSFLANTDAWAA I LPFVFAY IGLANTAEKGWPQTV 

170 180 190 200 210 220 

210 220 230 240 250 260 

orf 16 - 1 . pep WAFYVGAALLVI TS AFT I FKVKE YDPETYARYHG I DVAANQEKANW I ELLKTAPKAFWT 

lllllli llhlllMM I I I I I I I I I I I I I I M I I I I M I I hi II I I I I I : I 
orf 16ng WAFYVGAALLIITSAFTISKVKEYDPETYARYHGIDVAANQEKANWFELLKTAPKVFWT 
230 240 250 260 270 280 

270 280 290 300 310 320 

orf 16-1 .pep VTLVQFFCWFAFQYMWTYSAGAIAE^^VWHTTDASSVGYQEAGNWYGVLAAVQSVAAVICS 

II II h 1 1 1 1 h h 1 1 1 h 1 1 h 1 1 1 1 1 1 1 1 1 1 1 Ml 1 1 IMIIII 

orf 16ng VTPVQFFCWFAFRYMWTYSAGAIAENVWHTTDASSVGHQEAGNRYGVLAAVX 
290 300 310 320 330 340 

Based on this analysis, including the presence of several putative transmembrane domains in the 
gonococcal protein, it is predicted that the proteins from N. meningitidis and N. gonorrhoeae, and 
their epitopes, could be useful antigens for vaccines or diagnostics, or for raising antibodies. 



Example 19 

The following partial DNA sequence was identified in N. meningitidis [<SEQ ID 155>] (SEQ ID
NO: 155):



1 ATGTTGTTCC GTAAAACGAC CGCCGCCGTT TTGGCGCATA CCTTGATGCT 

51 GAACGGCTGT ACGTTGATGT TGTGGGGAAT GAACAACCCG GTCAGCGAAA 

101 CAATCACCCG NAAACACGTT GNCAAAGACC AAATCCGNGN CTTCGGTGTG 

151 GTTGCCGAAG ACAATGCCCA ATTGGAAAAG GGCAGCCTGG TGATGATGGG 

201 CGGAAAATAC TGGTTCGTCG TCAATCCCGA AGATTCGGCG AA.NTGACGG 

251 GNATTTTGAN GGCAGGGCTG GACAAACCCT TCCAAATAGT TNAGGATACC 

301 CCGAGCTATG C.TGCCACCA AGCCCTGCCG GTCAAACTCG GATCGNCTGG 

351 CAGCCAGAAT . . . 

This corresponds to the amino acid sequence [<SEQ ID 156; ORF28>] (SEQ ID NO: 156;
ORF28):



1 MLFRKTTAAV LAHTLMLNGC TLMLWGMNNP VSETITRKHV XKDQIRXFGV 
51 VAEDNAQLEK GSLVMMGGKY WFVVNPEDSA XXTGILXAGL DKPFQIVXDT
101 PSYXCHQALP VKLGSXGSQN. . . 

Further work revealed the complete nucleotide sequence [<SEQ ID 157>] (SEQ ID NO: 157):



1 ATGTTGTTCC GTAAAACGAC CGCCGCCGTT TTGGCGGCAA CCTTGATGCT 

51 GAACGGCTGT ACGTTGATGT TGTGGGGAAT GAACAACCCG GTCAGCGAAA 

101 CAATCACCCG CAAACACGTT GACAAAGACC AAATCCGCGC CTTCGGTGTG 

151 GTTGCCGAAG ACAATGCCCA ATTGGAAAAG GGCAGCCTGG TGATGATGGG 

201 CGGAAAATAC TGGTTCGTCG TCAATCCCGA AGATTCGGCG AAGCTGACGG 

251 GCATTTTGAA GGCAGGGCTG GACAAACCCT TCCAAATAGT TGAGGATACC

301 CCGAGCTATG CTCGCCACCA AGCCCTGCCG GTCAAACTCG AATCGCCTGG

351 CAGCCAGAAT TTCAGTACCG AAGGCCTTTG CCTGCGCTAC GATACCGACA

401 AGCCTGCCGA CATCGCCAAG CTGAAACAGC TCGGGTTTGA AGCGGTCAAA
451 CTCGACAATC GGACCATTTA CACGCGCTGC GTATCCGCCA AAGGCAAATA
501 CTACGCCACA CCGCAAAAAC TGAACGCCGA TTACCATTTT GAGCAAAGTG 
551 TGCCTGCCGA TATTTATTAC ACGGTTACTG AAGAACATAC CGACAAATCC 
601 AAGCTGTTTG CAAATATCTT ATATAGGCCC CCCTTTTTGA TACTGGATGC 
651 GGCGGGCGCG GTACTGGCCT TGCCTGCGGC GGCTCTGGGT GCGGTCGTGG 
701 ATGCCGCCCG CAAATGA 

This corresponds to the amino acid sequence [<SEQ ID 158; ORF28-1>] (SEQ ID NO: 158;
ORF28-1):

1 MLFRKTTAAV LAATLMLNGC TLMLWGMNNP VSETITRKHV DKDQIRAFGV

51 VAEDNAQLEK GSLVMMGGKY WFVVNPEDSA KLTGILKAGL DKPFQIVEDT

101 PSYARHQALP VKLESPGSQN FSTEGLCLRY DTDKPADIAK LKQLGFEAVK

151 LDNRTIYTRC VSAKGKYYAT PQKLNADYHF EQSVPADIYY TVTEEHTDKS

201 KLFANILYTP PFLILDAAGA VLALPAAALG AVVDAARK*

Computer analysis of this amino acid sequence gave the following results: 
Homology with a predicted ORF from N. meningitidis (strain A) 

ORF28 (SEQ ID NO: 156) shows 79.2% identity over a 120aa overlap with an ORF (ORF28a)
(SEQ ID NO: 160) from strain A of N. meningitidis:

10 20 30 40 50 60 

orf 28 . pep MLFRKTTAAVLAHTLMLNG CTLMLWGNn^PVSETITRKHVXKDQIRXFGVVAEDNAQLEK 

Illlllllllll MMIIIhM I : III Mill Mill lllllllllllll 
orf 28a MLFRKTTAAVLAATLMLNG CTVMMWGMNSPFSETTAJ^KHVDKDQIRAFGVVAEDNAQLEK 

10 20 30 40 50 60 

70 80 90 100 110 120 

orf 2 8 . pep GSLVMMGGKYWFWNPEDSAXXTGILXAGLDKPFQIVXDTPSYXCHQALPVKLGSXGSQN 

Illlllllllllllllllll 1 1 1 1 Mill MM M : MMMM I MM 

orf 2 8a GSLVMMGGKYWFWNPEDSAKLTGILKAGLDKQFQMVEPNPRFA- YQALPVKLESPASQN 

70 80 90 100 110 

orf 2 8a FSTEGLCLRYDTDRPADIAKLKQLEFEAVELDNRTIYTRCVSAKGKYYATPQKLNADYHF 
120 130 140 150 160 170 

The complete length ORF28a nucleotide sequence [<SEQ ID 159>] (SEQ ID NO: 159) is:

1 ATGTTGTTCC GTAAAACGAC CGCCGCCGTT TTGGCGGCAA CCTTGATGTT 

51 GAACGGCTGT ACGGTAATGA TGTGGGGTAT GAACAGCCCG TTCAGCGAAA 

101 CGACCGCCCG CAAACACGTT GACAAGGACC AAATCCGCGC CTTCGGTGTG 

151 GTTGCCGAAG ACAATGCCCA ATTGGAAAAG GGCAGCCTGG TGATGATGGG 

201 CGGGAAATAC TGGTTCGTCG TCAATCCTGA AGATTCGGCG AAGCTGACGG
251 GCATTTTGAA GGCCGGGTTG GACAAGCAGT TTCAAATGGT TGAGCCCAAC 

301 CCGCGCTTTG CCTACCAAGC CCTGCCGGTC AAACTCGAAT CGCCCGCCAG
351 CCAGAATTTG AGTACCGAAG GCCTTTGCCT GCGCTACGAT ACCGACAGAC 

401 CTGCCGACAT CGCCAAGCTG AAACAGCTTG AGTTTGAAGC GGTCGAAGTC
451 GACAATCGGA CCATTTACAC GCGCTGCGTC TCCGCCAAAG GCAAATACTA

501 CGCCACACCG CAAAAACTGA ACGCCGATTA TCATTTTGAG CAAAGTGTGC

551 CTGCCGATAT TTATTACACG GTTACGAAAA AACATACCGA CAAATCCAAG 

601 TTGTTTGAAA ATATTGCATA TACGCCCACC ACGTTGATAC TGGATGCGGT 

651 GGGCGCGGTG CTGGCCTTGC CTGTCGCGGC GTTGATTGCA GCCACGAATT 

701 CCTCAGACAA ATGA

This encodes a protein having amino acid sequence [<SEQ ID 160>] (SEQ ID NO: 160):

1 MLFRKTTAAV LAATLMLNGC TVMMWGMNSP FSETTARKHV DKDQIRAFGV

51 VAEDNAQLEK GSLVMMGGKY WFVVNPEDSA KLTGILKAGL DKQFQMVEPN

101 PRFAYQALPV KLESPASQNF STEGLCLRYD TDRPADIAKL KQLEFEAVEL

151 DNRTIYTRCV SAKGKYYATP QKLNADYHFE QSVPADIYYT VTKKHTDKSK

201 LFENIAYTPT TLILDAVGAV LALPVAALIA ATNSSDK*

ORF28a (SEQ ID NO: 160) and ORF28-1 (SEQ ID NO: 158) show 86.1% identity in 238 aa
overlap:

10 20 30 40 50 60 

or f 2 8a . pep MLFRKTTAAVLAATLMLNGCTVMMWGMNS P FS ETTARKHVDKDQ I RAFGWAEDNAQLEK 

III II III II I II I II Mllhhl II hi III :| 1 1 1 1 M 1 1 1 1 ! I M 1 M 1 1 1 1 1 

orf 28-1 MLFRKTTAAVLAATLMLNGCTLMLWGMNNPVSETITRKHVDKDQIRAFGWAEDNAQLEK 
10 20 30 40 50 60

70 80 90 100 110 119 

orf 28a . pep GSLVMMGGKYWFVVNPEDSAKLTGILKAGLDKQFQMVEPNPRFA- YQALPVKLESPASQN 

M I I I M I M I I I I I I II M II I I M M I II i I : I I I H :|||||||llhlll 
or f 2 8 - 1 GSLVMMGGKYWFWNPEDSAKLTGILKAGLDKPFQIVEDTPSYARHQALPVKLESPGSQN 
70 80 90 100 110 120

120 130 140 150 160 170 179 

orf 28a . pep FSTEGLCLRYDTDRPADIAKLKQLEFEAVELDNRTIYTRCVSAKGKYYATPQKLNADYHF 

M IIIIIIIIIMMII III 1 1 1 1 : 1 1 1 1 : 1 I I I I I I I I I I I I I I I I I I I I I I 

orf28-l FSTEGLCLRYDTDKPADIAKLKQLGFEAVKLDNRTIYTRCVSAKGKYYATPQKLNADYHF 
130 140 150 160 170 180

180 190 200 210 220 230 

orf 2 8a . pep EQS VP AD I YYTVTKKHTDKSKLFEN I AYTPTTL I LDAVGAVLALP VAAL I AATNS SDKX 

I IIIIIIMII-MII Ml || Ml I I I I h I II I I I h I I I | : : : : : II 
orf28-l EQSVPADIYYTVTEEHTDKSKLFANILYTPPFLILDAAGAVLALPAAALGAWDAARKX 
190 200 210 220 230

Homology with a predicted ORF from N. gonorrhoeae 

ORF28 (SEQ ID NO: 156) shows 84.2% identity over a 120aa overlap with a predicted ORF
(ORF28.ng) (SEQ ID NO: 162) from N. gonorrhoeae:

orf 2 8 . pep MLFRKTTAAVLAHTLMLNGCTLMLWGMNNP VSET I TRKHVXKDQ I RXFGWAEDNAQLEK 60 

40 I Mill II II II I h II II M M llllllhlllllll Mill 1 1 Ml I Ml I II 

orf 2 8ng MLFRKTTAAVIJ^TLILNGCTMMLRGMNNPVSQTITRKHVDKPQIRAFGW 60 

orf 2 8 . pep GSLVMMGGKYWFWNPEDSAXXTGILXAGLDKPFQIVXDTPSYXCHQALPVKLGSXGSQN 120 

1 1 1 1 1 1 1 1 1 1 1 1 : 1 1 1 1 1 1 E 11 = 1 IMIMIMI Mill llllllh : MM 

orf28ng GSLVMMGGKYWFAVNPEDSAKLTGLLKAGLDKPFQIVEDTPSYARHQALPVKFEAPGSQN 120 

The complete length ORF28ng nucleotide sequence [<SEQ ID 161>] (SEQ ID NO: 161) is:

1 ATGTTGTTCC GTAAAACGAC CGCCGCCGTT TTGGCGGCAA CCTTGATACT 

51 GAACGGCTGT ACGATGATGT TGCGGGGGAT GAACAACCCG GTCAGCCAAA 

101 CAATCACCCG CAAACACGTT GACAAAGACC AAATCCGCGC CTTCGGTGTG

151 GTTGCCGAAG ACAATGCCCA ATTGGAAAAG GGCAGCCTGG TGATGATGGG 

201 CGGGAAATAC TGGTTCGCCG TCAATCCCGA AGATTCGGCG AAGCTGACGG 

251 GCCTTTTGAA GGCCGGGTTG GACAAGCCCT TCCAAATAGT TGAGGATACC

301 CCGAGCTATG CCCGCCACCA AGCCCTGCCG GTCAAATTCG AAGCGCCCGG
351 CAGCCAGAAT TTCAGTACCG GAGGTCTTTG CCTGCGCTAT GATACCGGCA

401 GACCTGACGA CATCGCCAAG CTGAAACAGC TTGAGTTTAA AGCGGTCAAA
451 CTCGACAATC GGACCATTTA CACGCGCTGC GTATCCGCCA AAGGCAAATA
501 CTACGCCACG CCGCAAAAAC TGAACGCCGA TTATCATTTT GAGCAAAGTG
551 TGCCCGCCGA TATTTATTAT ACGGTTACTG AAAAACATAC CGACAAATCC

601 AAGCTGTTTG GAAATATCTT ATATACGCCC CCCTTGTTGA TATTGGATGC

651 GGCGGCCGCG GTGCTGGTCT TGCCTATGGC TCTGATTGCA GCCGCGAATT
701 CCTCAGACAA ATGA 

This encodes a protein having amino acid sequence [<SEQ ID 162>] (SEQ ID NO: 162):

1 MLFRKTTAAV LAATLILNGC TMMLRGMNNP VSQTITRKHV DKDQIRAFGV

51 VAEDNAQLEK GSLVMMGGKY WFAVNPEDSA KLTGLLKAGL DKPFQIVEDT
101 PSYARHQALP VKFEAPGSQN FSTGGLCLRY DTGRPDDIAK LKQLEFKAVK
151 LDNRTIYTRC VSAKGKYYAT PQKLNADYHF EQSVPADIYY TVTEKHTDKS
201 KLFGNILYTP PLLILDAAAA VLVLPMALIA AANSSDK*

ORF28ng (SEQ ID NO: 162) and ORF28-1 (SEQ ID NO: 158) share 90.0% identity in 231 aa
overlap:

10 20 30 40 50 60 

or f 2 8 - 1 . pep MLFRKTTAAVLAATLMLNGCTLMLWGMNNPVS ET I TRKHVDKDQ I RAFGWAEDNAQLEK 

30 II 1 1 1 1 1 1 1 1 1 1 1 1 h 1 1 1 1 h 1 1 1 1 1 1 1 1 h 1 1 1 1 1 1 1 1 1 1 II 1 1 1 1 1 1 1 1 1 1 1 1 1 II 

orf28ng ML FRKTTAAVLAATL I LNGCTMMLRGMNNPVSQT I TRKHVDKDQ I RAFGWAEDNAQLEK 

10 20 30 40 50 60 

70 80 90 100 110 120 

orf 28- 1 . pep GSLVMMGGKYWFVVNPEDSAKLTGILKAGLDKPFQIVEDTPSYARHQALPVKLESPGSQN 

35 IIIMIIIIIMIIIIIM IMMMIMI IIIIIIIIIMIII IhMM I 

orf28ng GSLVMMGGKYWFAVNPEDSAKLTGLLKAGLDKPFQIVEDTPSYARHQALPVKFEAPGSQN 

70 80 90 100 110 120 



40 



orf 28-1 .pep 
orf 2 8ng 



130 140 150 160 170 180 

FSTEGLCLRYDTDKPADIAKLKQLGFEAVKLDNRTIYTRCVSAKGKYYATPQKLNADYHF 

III MINI I llllllll Milllll IIIIIIMMII lllllllllll! 

FSTGGLCLRYDTGRPDDIAKLKQLEFKAVKLDNRT I YTRCVSAKGKYYAT PQKLNADYHF 
130 140 150 160 170 180 



190 200 210 220 230 239 

orf 28- 1 . pep EQSVPADIYYTVTEEHTDKSKLFANILYTPPFLILDAAGAVLALPAAALGAWDAARKX 
45 | || | | | | | | | | | | | : | | | | | | | | : | | | | | | | : | | | | | | : | | | : | | | : : | : 

orf28ng EQSVPADIYYTVTEKHTDKSKLFGNILYTPPLLILDAAAAVLVLPMALIAAANSSDKX 

190 200 210 220 230 

Based on this analysis, including the presence of a putative transmembrane domain in the
gonococcal protein, it was predicted that the proteins from N. meningitidis and N. gonorrhoeae, and
their epitopes, could be useful antigens for vaccines or diagnostics, or for raising antibodies.

ORF28-1 (SEQ ID NO: 158) (24 kDa) was cloned in pET and pGex vectors and expressed in
E. coli, as described above. The products of protein expression and purification were analyzed by
SDS-PAGE. Figure 6A shows the results of affinity purification of the GST-fusion protein, and
Figure 6B shows the results of expression of the His-fusion in E. coli. Purified GST-fusion protein
was used to immunise mice, whose sera were used for ELISA, which gave a positive result. These
experiments confirm that ORF28-1 (SEQ ID NO: 158) is a surface-exposed protein, and that it may
be a useful immunogen.



Example 20

The following partial DNA sequence was identified in N. meningitidis [<SEQ ID 163>] (SEQ ID
NO: 163):

1 . . GTCAGTCCTG TACTGCCTAT TACACACGAA CGGACAGGGT TTGAAGGTGT

51 TATCGGTTAT GAAACCCATT TTTCAGGGCA CGGACATGAA GTACACAGTC

101 CGTTCGATCA TCATGATTCA AAAAGCACTT CTGATTTCAG CGGCGGTGTA

151 GACGGCGGTT TTACTGTTTA CCAACTTCAT CGAACATGGT CGGAAATCCA

201 TCCGGAGGAT GAATATGACG GGCCGCAAGC AGCG.ATTAT CCGCCCCCCG

251 GAGGAGCAAG GGATATATAC AGCTATTATG TCAAAGGAAC TTCAACAAAA

301 ACAAAGACTA GTATTGTCCC TCAAGCCCCA TTTTCAGACC GTTGGCTAGA

351 AGAAAATGCC GGTGCCGCCT CTGGT. .

This corresponds to the amino acid sequence [<SEQ ID 164; ORF29>] (SEQ ID NO: 164;
ORF29):

1 .VSPVLPITHE RTGFEGVIGY ETHFSGHGHE VHSPFDHHDS KSTSDFSGGV
51 DGGFTVYQLH RTWSEIHPED EYDGPQAAXY PPPGGARDIY SYYVKGTSTK
101 TKTSIVPQAP FSDRWLEENA GAASG. .

Further work revealed the complete nucleotide sequence [<SEQ ID 165>] (SEQ ID NO: 165):

1 ATGAATTTGC CTATTCAAAA ATTCATGATG CTGTTTGCAG CAGCAATATC

51 GTTGCTGCAA ATCCCCATTA GTCATGCGAA CGGTTTGGAT GCCCGTTTGC

101 GCGATGATAT GCAGGCAAAA CACTACGAAC CGGGTGGTAA ATACCATCTG

151 TTTGGTAATG CTCGCGGCAG TGTTAAAAAG CGGGTTTACG CCGTCCAGAC

201 ATTTGATGCA ACTGCGGTCA GTCCTGTACT GCCTATTACA CACGAACGGA

251 CAGGGTTTGA AGGTGTTATC GGTTATGAAA CCCATTTTTC AGGGCACGGA

301 CATGAAGTAC ACAGTCCGTT CGATCATCAT GATTCAAAAA GCACTTCTGA
351 TTTCAGCGGC GGTGTAGACG GCGGTTTTAC TGTTTACCAA CTTCATCGAA

401 CAGGGTCGGA AATCCATCCG GAGGATGGAT ATGACGGGCC GCAAGGCAGC

451 GATTATCCGC CCCCCGGAGG AGCAAGGGAT ATATACAGCT ATTATGTCAA

501 AGGAACTTCA ACAAAAACAA AGACTAATAT TGTCCCTCAA GCCCCATTTT 

551 CAGACCGTTG GCTAAAAGAA AATGCCGGTG CCGCCTCTGG TTTTTTCAGC 

601 CGTGCGGATG AAGCAGGAAA ACTGATATGG GAAAGCGACC CCAATAAAAA 

651 TTGGTGGGCT AACCGTATGG ATGATGTTCG CGGCATCGTC CAAGGTGCGG 

701 TTAATCCTTT TTTAATGGGT TTTCAAGGAG TAGGGATTGG GGCAATTACA 

751 GACAGTGCAG TAAGCCCGGT CACAGATACA GCCGCGCAGC AGACTCTACA 

801 AGGTATTAAT GATTTAGGAA AATTAAGTCC GGAAGCACAA CTTGCTGCCG 

851 CGAGCCTATT ACAGGACAGT GCTTTTGCGG TAAAAGACGG TATCAACTCT 

901 GCCAAACAAT GGGCTGATGC CCATCCAAAT ATAACAGCTA CTGCCCAAAC 

951 TGCCCTTTCC GCAGCAGAGG CCGCAGGTAC GGTTTGGAGA GGTAAAAAAG 

1001 TAGAACTTAA CCCGACTAAA TGGGATTGGG TTAAAAATAC CGGTTATAAA 

1051 AAACCTGCTG CCCGCCATAT GCAGACTTTA GATGGGGAGA TGGCAGGTGG 

1101 GAATAAACCT ATTAAATCTT TACCAAACAG TGCCGCTGAA AAAAGAAAAC 

1151 AAAATTTTGA GAAGTTTAAT AGTAACTGGA GTTCAGCAAG TTTTGATTCA 

1201 GTGCACAAAA CACTAACTCC CAATGCACCT GGTATTTTAA GTCCTGATAA
1251 AGTTAAAACT CGATACACTA GTTTAGATGG AAAAATTACA ATTATAAAAG 
1301 ATAACGAAAA CAACTATTTT AGAATCCATG ATAATTCACG AAAACAGTAT 

1351 CTTGATTCAA ATGGTAATGC TGTGAAAACC GGTAATTTAC AAGGTAAGCA

1401 AGCAAAAGAT TATTTACAAC AACAAACTCA TATCAGGAAC TTAGACAAAT
1451 GA 

This corresponds to the amino acid sequence [<SEQ ID 166; ORF29-1>] (SEQ ID NO: 166;
ORF29-1):

1 MNLPIQKFMM LFAAAISLLQ IPISHANGLD ARLRDDMQAK HYEPGGKYHL

51 FGNARGSVKK RVYAVQTFDA TAVSPVLPIT HERTGFEGVI GYETHFSGHG 

101 HEVHSPFDHH DSKSTSDFSG GVDGGFTVYQ LHRTGSEIHP EDGYDGPQGS 

151 DYPPPGGARD IYSYYVKGTS TKTKTNIVPQ APFSDRWLKE NAGAASGFFS 

201 RADEAGKLIW ESDPNKNWWA NRMDDVRGIV QGAVNPFLMG FQGVGIGAIT

251 DSAVSPVTDT AAQQTLQGIN DLGKLSPEAQ LAAASLLQDS AFAVKDGINS
301 AKQWADAHPN ITATAQTALS AAEAAGTVWR GKKVELNPTK WDWVKNTGYK

351 KPAARHMQTL DGEMAGGNKP IKSLPNSAAE KRKQNFEKFN SNWSSASFDS

401 VHKTLTPNAP GILSPDKVKT RYTSLDGKIT IIKDNENNYF RIHDNSRKQY
451 LDSNGNAVKT GNLQGKQAKD YLQQQTHIRN LDK*

Computer analysis of this amino acid sequence gave the following results: 
Homology with a predicted ORF from N. meningitidis (strain A)

ORF29 (SEQ ID NO: 164) shows 88.0% identity over a 125aa overlap with an ORF (ORF29a)
(SEQ ID NO: 168) from strain A of N. meningitidis:

10 20 30 

orf29 pep VSPVLP I THERTGFEGVIGYETHFSGHGHE 

M : I I I I I I I I I I M I I I I I II i I I I 
or f 2 9a EPGGKYHLFGNARGSVKNRVYAVQTFDATAVGPILPITHERTGFEGI IGYETHFSGHGHE 

50 60 70 80 90 100 

40 50 60 70 80 90 

orf 2 9. pep VHSPFDHHDSKSTSDFSGGVDGGFTVYQLHRTWSEIHPEDEYDGPQAAXYPPPGGARDIY 

I M MM 1 1 M II 1 1 II 1 1 1 1 M 1 1 1 1 II 1 1 MUM Mill:: I i I M 1 1 1 II 1 ' 

orf 2 9a VHSPFDNHDSKSTSDFSGGVDGGFTVYQLHRTGSEIHPEDGYDGPQGSDYPPPGGARDIY 
110 120 130 140 150 160

100 110 120

orf 29 . pep SYYVKGTSTKTKTSIVPQAPFSDRWLEENAGAASG 

1 1 N 1 1 1 1- 1 1 M I II 1 1 1 M 1 1 1 1 1 1 

orf 29a XXYVKGTSTKTKSNIVPRAPFSDRWLKENAGAASGFFSRADEAGKLIWESDPNKNWWANR 
170 180 190 200 210 220

orf 2 9a MDD I RG I VQGAVNP FLMGFQGVG I GAI TDS AVS PVTDTAAQQTLQGXNHLGXLS PEAQLA 

230 240 250 260 270 280 

The complete length ORF29a nucleotide sequence [<SEQ ID 167>] (SEQ ID NO: 167) is:

1 ATGAATTNGC CTATTCAAAA ATTCATGATG CTGTTTGCAG CAGCAATATC

51 GTNGCTGCAA ATCCCNATTA GTCATGCGAA CGGTTTGGAT GCCCGTTTGC 

101 GCGATGATAT GCAGGCAAAA CACTACGAAC CGGGTGGTAA ATACCATCTG 

151 TTTGGTAATG CTCGCGGCAG TGTTAAAAAT CGGGTTTACG CCGTCCAAAC 

201 ATTTGATGCA ACTGCGGTCG GCCCCATACT GCCTATTACA CACGAACGGA
251 CAGGATTTGA AGGCATTATC GGTTATGAAA CCCATTTTTC AGGACATGGA

301 CATGAAGTAC ACAGTCCGTT CGATAATCAT GATTCAAAAA GCACTTCTGA

351 TTTCAGCGGC GGCGTAGACG GTGGTTTTAC CGTTTACCAA CTTCATCGGA
401 CAGGGTCGGA AATCCATCCG GAGGATGGAT ATGACGGGCC GCAAGGCAGC

451 GATTATCCGC CCCCCGGAGG AGCAAGGGAT ATATACANNT ANTATGTCAA
501 AGGAACTTCA ACAAAAACAA AGAGTAATAT TGTTCCCCGA GCCCCATTTT

551 CAGACCGCTG GCTAAAAGAA AATGCCGGTG CCGCCTCTGG TTTTTTCAGC 

601 CGTGCTGATG AAGCAGGAAA ACTGATATGG GAAAGCGACC CCAATAAAAA 

651 TTGGTGGGCT AACCGTATGG ATGATATTCG CGGCATCGTC CAAGGTGCGG 

701 TTAATCCTTT TTTAATGGGT TTTCAAGGAG TAGGGATTGG GGCAATTACA 

751 GACAGTGCAG TAAGCCCGGT CACAGATACA GCCGCGCAGC AGACTCTACA

801 AGGTATNAAT CATTTAGGAA ANTTAAGTCC CGAAGCACAA CTTGCGGCTG 

851 CAACCGCATT ACAAGACAGT GCTTTTGCGG TAAAAGACGG TATCAATTCC 

901 GCCAGACAAT GGGCTGATGC CCATCCGAAT ATAACTGCAA CAGCCCAAAC 

951 TGCCCTTGCC GTAGCAGANG CCGCAACTAC GGTTTGGGGC GGTAAAAAAG 

1001 TAGAACTTAA CCCGACCAAA TGGGATTGGG TTAAAAATAC NGGCTATAAN

1051 ACACCTGCTG TTCGCACCAT GCATACTTTG GATGGGGAAA TGGCCGGTGG 

1101 GAATAGACCG CCTAAATCTA TAACGTCCAA CAGCAAAGCA GATGCTTCCA 

1151 CACAACCGTC TTTACAAGCG CAACTAATTG GAGAACAAAT TANNNNNGGG 

1201 CATGCTTATA ACAAGCATGT CATAAGACAA CAAGAATTTA CGGATTTAAA
1251 TATCAATTCA CCAGCAGATT TTGCTCGGCA TATTGAAAAT ATTGTTAGCC

1301 ATCCANCAAA TATGAAAGAG TTACCTCGCG GTAGAACTGC GTATTGGGAT
1351 NATAAAACAG GGACNATAGT TATCCGAGAT AAAAATTCTG ACGATGGAGG
1401 TACAGCATTT AGACCAACAT CAGGTAAAAA ATATTATGAT GATTTATAG 

This encodes a protein having amino acid sequence [<SEQ ID 168>] (SEQ ID NO: 168):

1 MNXPIQKFMM LFAAAISXLQ IPISHANGLD ARLRDDMQAK HYEPGGKYHL

51 FGNARGSVKN RVYAVQTFDA TAVGPILPIT HERTGFEGII GYETHFSGHG

101 HEVHSPFDNH DSKSTSDFSG GVDGGFTVYQ LHRTGSEIHP EDGYDGPQGS

151 DYPPPGGARD IYXXYVKGTS TKTKSNIVPR APFSDRWLKE NAGAASGFFS

201 RADEAGKLIW ESDPNKNWWA NRMDDIRGIV QGAVNPFLMG FQGVGIGAIT

251 DSAVSPVTDT AAQQTLQGXN HLGXLSPEAQ LAAATALQDS AFAVKDGINS

301 ARQWADAHPN ITATAQTALA VAXAATTVWG GKKVELNPTK WDWVKNTGYX
351 TPAVRTMHTL DGEMAGGNRP PKSITSNSKA DASTQPSLQA QLIGEQIXXG

401 HAYNKHVIRQ QEFTDLNINS PADFARHIEN IVSHPXNMKE LPRGRTAYWD
451 XKTGTIVIRD KNSDDGGTAF RPTSGKKYYD DL*

ORF29a (SEQ ID NO: 168) and ORF29-1 (SEQ ID NO: 166) show 90.1% identity in 385 aa
overlap:

10 20 30 40 50 60

orf29a.pep MNXP IQKFMMLFAAAISXLQI P I SHANGLDARLRDDMQAKHYEPGGKYHLFGNARGSVKN 

II llllllllllllll II I II 1 1 1 1 II II II 1 1 III 1 1 1 1 M I III M 1 1 1 1 1 1 1 M 

orf2 9-l MNL P I Q KFMML FAAA ISLLQIPIS HANGLD ARLRDDMQAKH YE PGGKYHL FGNARGS VKK 

5 10 20 30 40 50 60 

70 80 90 100 110 120 

orf2 9a.pep RVYAVQTFDATAVGPILPITHERTGFEGIIGYETHFSGHGHEVHSPFDNHDSKSTSDFSG 
II I I I I M I I I I MM I I I I I I M I I Ml I I I I I I I I I I I I II I I M I M M I I I I I 
orf 2 9 - 1 RVYAVQTFDATAVS PVLP I THERTGFEGVI GYETHFSGHGHEVHS PFDHHDS KSTSDFSG 

l'O 70 80 90 100 110 120 

130 140 150 160 170 180 

orf 2 9a. pep GVDGGFTVYQLHRTGSEIHPEDGYDGPQGSDYPPPGGARDIYXXYVKGTSTKTKSNIVPR 

M 1 1 1 II I II M M 1 1 1 II II M I II II 1 1 MM Ml MM 1 1 1 1 1 1 1 1 II : 1 1 1 1 = 

orf 2 9-1 GVDGGFTVYQLHRTGSEIHPEDGYDGPQGSDYPPPGGARDIYSYYVKGTSTKTKTNIVPQ 
15 130 140 150 160 170 180 

190 200 210 220 230 240 

orf 2 9a. pep APFSDRWLKENAGAASGFFSRADEAGKLIWESDPNKNWWANRMDDIRGIVQGAVNPFLMG 

II MINIM IIIIIIIIIIMIIIIMIIIMIIIIII IIIIIMIIIMIIIIMII 

orf 2 9-1 APFSDRWLKENAGAASGFFSRADEAGKLIWESDPNKNWWANRMDDVRGIVQGAVNPFLMG 
20 190 200 210 220 230 240 

250 260 270 280 290 300 

orf 29a . pep FQGVGIGAITDSAVSPVTDTAAQQTLQGXNHLGXLSPEAQLAAATALQDSAFAVKDGINS 

Illlllllllllllllllllllllllll I II lllllllllh llllllllllllll 

orf 29-1 FQGVGIGAITDSAVSPVTDTAAQQTLQGINDLGKLSPEAQLAAASLLQDSAFAVKDGINS 
25 250 260 270 280 290 300 

310 320 330 340 350 360 

orf 2 9a . pep ARQWADAHPNITATAQTALAVAXAATTVWGGKKVELNPTKWDWVKNTGYXTPAVRTMHTL 

IMIMIMI IMIIIIM: II III 1 1 1 1 1 1 M 1 1 1 1 II 1 1 1 1 Ihl hll 

orf 2 9 - 1 AKQWADAHPNITATAQTALSAAEAAGTVWRGKKVELNPTKWDWVKNTGYKKPAARHMQTL 
30 310 320 330 340 350 360 

370 380 390 400 410 420 

orf 2 9a. pep DGEMAGGNRPPKSITSNSKADASTQPSLQAQLIGEQIXXGHAYNKHVIRQQEFTDLNINS 

I I I I 1 I I I : I lh II I | 
or f 2 9 - 1 DGEMAGGNKP I KS LP - NS AAEKRKQNFEKFNSNWS S AS FDS VHKTLT PNAPG ILSPDKVK 

35 370 380 390 400 410 

Homology with a predicted ORF from N. gonorrhoeae 

ORF29 (SEQ ID NO: 164) shows 88.8% identity over a 125aa overlap with a predicted ORF
(ORF29.ng) (SEQ ID NO: 170) from N. gonorrhoeae:



or f 2 9 . pep VSPVLPITHERTGFEGVIGYETHFSGHGHE 3 0 

40 M M Ml 1 1 1 1 1 II M 1 1 Ml I II I M I 

orf29ng E PGGKYHL FGNARGS VKNRVCAVQTFDATAVGP I LP I THERTGFEGVI GYETHFSGHGHE 102 

orf 2 9 . pep VHSPFDHHDSKSTSDFSGGVDGGFTVYQLHRTWSEIHPEDEYDGPQAAXYPPPGGARDIY 90 

I II I MM I M 1 1 1 1 1 II 1 1 M II I M I II lllllll MM- lllllllllll 

orf29ng VHSPFDNHDSKSTSDFSGGVDGGFTVYQLHRTGSEIHPEDGYDGPQGGGYPPPGGARDIY 162 


orf29.pep  SYYVKGTSTKTKTSIVPQAPFSDRWLEENAGAASG 125
           ||::|||||||| : |||||||||||:||||||||
orf29ng    SYHIKGTSTKTKINTVPQAPFSDRWLKENAGAASGFLSRADEAGKLIWENDPDKNWRANR 222

The complete length ORF29ng nucleotide sequence [<SEQ ID 169>] (SEQ ID NO: 169) is
predicted to encode a protein having amino acid sequence [<SEQ ID 170>] (SEQ ID NO: 170):

1 MNLPIQKFMM LFAAAISLLQ IPISHANGLD ARLRDDMQAK HYEPGGKYHL

51 FGNARGSVKN RVCAVQTFDA TAVGPILPIT HERTGFEGVI GYETHFSGHG 

101 HEVHSPFDNH DSKSTSDFSG GVDGGFTVYQ LHRTGSEIHP EDGYDGPQGG 

151 GYPPPGGARD IYSYHIKGTS TKTKINTVPQ APFSDRWLKE NAGAASGFLS 

201 RADEAGKLIW ENDPDKNWRA NRMDDIRGIV QGAVNPFLTG FQGLGVGAIT 

251 DSAVSPVTYA AARKTLQGIH NLGNLSPEAQ LAAATALQDS AFAVKDSINS 

301 ARQWADAHPN ITATAQTALA VTEAATTVWG GKKVELNPAK WDWVKNTGYK 

351 KPAARHMQTV DGEMAGGNKP LESKNTVTTN NFFENTGYTE KVLRQASNGD 

401 YHGFPQSVDA FSENGTVIQI VGGDNIVRHK LYIPGSYKGK DGNFEYIREA 

451 DGKINHRLFV PNQQLPEK*

In a second experiment, the following DNA sequence [<SEQ ID 171>] (SEQ ID NO: 171) was
identified:



1 atgAATTTGC CTATTCAAAA ATTCATGATG ctgttggcAg cggcaatatc 

51 gatgctGCat ATCCCCATTA GTCATGCGAA CGGTTTGGAT GCCCGTTTGC

101 GCGATGATAT GCAGGCAAAA CACTACGAAC CGGGTGGCAA ATACCATCTG 

151 TTTGGTAATG CTCGCGGCAG TGTTAAAAAT CGGGTTTGCG CCGTCCAAAC 

201 ATTTGATGCA ACTGCGGTCG GCCCCATACT GCCTATTACA CACGAACGGA 

251 CAGGATTTGA AGGTGTTATC GGCTATGAAA CCCATTTTTC AGGACACGGA 

301 CACGAAGTAC ACAGTCCGTT CGATAATCAT GATTCAAAAA GCACTTCTGA
351 TTTCAGCGGC GGCGTAGACG GCGGTTTTAC CGTTTACCAA CTTCATCGGA 

401 CAGGGTCGGA AATACATCCC GCAGACGGAT ATGACGGGCC TCAAGGCGGC
451 GGTTATCCGG AACCACAAGG GGCAAGGGAT ATATACAGCT ACCATATCAA
501 AGGAACTTCA ACCAAAACAA AGATAAACAC TGTTCCGCAA GCCCCTTTTT 
551 CAGACCGCTG GCTAAAAGAA AATGCCGGTG CCGCTTCCGG TTTTCTCAGC 
601 CGTGCGGATG AAGCAGGAAA ACTGATATGG GAAAACGACC CCGATAAAAA 
651 TTGGCGGGCT AACCGTATGG ATGATATTCG CGGCATCGTC CAAGGTGCGG 
701 TTAATCCTTT TTTAACGGGT TTTCAAGGGG TAGGGATTGG GGCAATTACA 
751 GACAGTGCGG TAAGCCCGGT CACAGATACA GCCGCTCAGC AGACTCTACA 
801 AGGTATTAAT GATTTAGGAA ATTTAAGTCC GGAAGCACAA CTTGCCGCCG 
851 CGAGCCTATT ACAGGACAGT GCCTTTGCGG TAAAAGACGG CATCAATTCC 
901 GCCAGACAAT GGGCTGATGC CCATCCGAAT ATAACAGCAA CAGCCCAAAC 
951 TGCCCTTGCC GTAGCAGAGG CCGCAGGTAC GGTTTGGCGC GGTAAAAAAG 

1001 TAGAACTTAA CCCGACCAAA TGGGATTGGG TTAAAAATAC CGGCTATAAA 

1051 AAACCTGCTG CCCGCCATAT GCAGACTGTA GATGGGGAGA TGGCAGGGGG 

1101 GAATAGACCG CCTAAATCTA TAACGTCGGA AGGAAAAGCT AATGCTGCAA 

1151 CCTATCCTAA GTTGGTTAAT CAGCTAAATG AGCAAAACTT AAATAACATT 

1201 GCGGCTCAAG ATCCAAGATT GAGTCTAGCT ATTCATGAGG GTAAAAAAAA

1251 TTTTCCAATA GGAACTGCAA CTTATGAAGA GGCAGATAGA CTAGGTAAAA

1301 TTTGGGTTGG TGAGGGTGCA AGACAAACTA GTGGAGGCGG ATGGTTAAGT
1351 AGAGATGGCA CTCGACAATA TCGGCCACCA ACAGAAAAAA AATCACAATT

1401 TGCAACTACA GGTATTCAAG CAAATTTTGA AACTTATACT ATTGATTCAA
1451 ATGAAAAAAG AAATAAAATT AAAAATGGAC ATTTAAATAT TAGGTAA



This encodes a protein having amino acid sequence [<SEQ ID 172; ORF29ng-1>] (SEQ ID NO:
172; ORF29ng-1):

1 MNLPIQKFMM LLAAAISMLH IPISHANGLD ARLRDDMQAK HYEPGGKYHL

51 FGNARGSVKN RVCAVQTFDA TAVGPILPIT HERTGFEGVI GYETHFSGHG 

101 HEVHSPFDNH DSKSTSDFSG GVDGGFTVYQ LHRTGSEIHP ADGYDGPQGG 

151 GYPEPQGARD IYSYHIKGTS TKTKINTVPQ APFSDRWLKE NAGAASGFLS 

201 RADEAGKLIW ENDPDKNWRA NRMDDIRGIV QGAVNPFLTG FQGVGIGAIT

251 DSAVSPVTDT AAQQTLQGIN DLGNLSPEAQ LAAASLLQDS AFAVKDGINS 

301 ARQWADAHPN ITATAQTALA VAEAAGTVWR GKKVELNPTK WDWVKNTGYK 

351 KPAARHMQTV DGEMAGGNRP PKSITSEGKA NAATYPKLVN QLNEQNLNNI 

401 AAQDPRLSLA IHEGKKNFPI GTATYEEADR LGKIWVGEGA RQTSGGGWLS 

451 RDGTRQYRPP TEKKSQFATT GIQANFETYT IDSNEKRNKI KNGHLNIR*

ORF29ng-1 (SEQ ID NO: 172) and ORF29-1 (SEQ ID NO: 166) show 86.0% identity in 401 aa
overlap:



10 20 30 40 50 60 

1 5 orf 2 9ng- 1 . pep MNLPIQKFMMLLAAAISMLHIPISHANGLDARLRDDMQAKHYEPGGKYHLFGNARGSVKN 

I I I I I I I I I I: I I I I : h I I i I I i I Ml I I I I I I I I I I I I I M M I I I I I I I M I h 
orf 29-1 MNLP IQKFMMLFAAAI SLLQI P I SHANGLDARLRDDMQAKHYEPGGKYHLFGNARGSVKK 

10 20 30 40 50 60 

70 80 90 100 110 120 

20 orf2 9ng-l .pep RVCAVQTFDATAVGPILPITHERTGFEGVIGYETHFSGHGHEVHSPFDNHDSKSTSDFSG 

II I I I II I I I I I : I : II I I I I I I I I I I I I I I I II I I I I I I I I I I I I h I I I I I I I II I I 
orf 2 9 - 1 RVYAVQTFDATAVS PVLPITHERTGFEGVIGYETHFSGHGHEVHSPFDHHDSKSTSDFSG 

70 80 90 100 110 120 

130 140 150 160 170 180 

25 orf2 9ng-l .pep GVDGGFTVYQLHRTGSEIHPADGYDGPQGGGYPEPQGARDIYSYHIKGTSTKTKINTVPQ 

MIIMIIIIIMII Mil 11111111= II I I : I II I I -I I I I I M I III 
orf 2 9-1 GVDGGFTVYQLHRTGSEIHPEDGYDGPQGSDYPPPGGARDIYSYYVKGTSTKTKTNIVPQ 

130 140 150 160 .170 180 

190 200 210 220 230 240 

30 orf 2 9ng- 1 . pep APFSDRWLKENAGAASGFLSRADEAGKLIWENDPDKNWRANRMDDIRGIVQGAVNPFLTG 

I M 1 1 1 1 1 1 1 1 1 1 1 I hi 1 1 1 1 1 1 1 1 1 1 M M 1 1 1 1 1 M 1 1 1 1 1 1 1 1 1 1 I 

orf 2 9-1 APFSDRWLKENAGAASGFFSRADEAGKLIWESDPNKNWWANRMDDVRGIVQGAVNPFLMG 

190 200 210 220 230 240 

250 260 270 280 290 300 

35 orf 2 9ng- 1 . pep FQGVG I GA I TDS AVS P VTDTAAQQTLQG INDLGNLS P EAQLAAAS LLQDS AFAVKDG I NS 

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I llh I I I I I I I I I I I I I I M I I I I I II I I 
orf 2 9 - 1 FQGVG I GAI TDS AVS P VTDTAAQQTLQG INDLGfCLS PEAQLAAAS LLQDS AFAVKDG INS 

250 260 270 280 290 300 

310 320 330 340 350 360 

40 orf 2 9ng- 1 . pep ARQWADAHPN I TATAQTALAVAEAAGTVWRGKKVELNPTKWDWVKNTGYKKPAARHMQTV 

h I I I I I I II I I I . I I I I I - I I I I I I I I I I I I I I I ■! I I II I I I I I I I I I I I I I I M h 
or f 2 9 - 1 AKQWADAHPNITATAQTALSAAEAAGTVWRGKKVELNPTKWDWVKNTGYKKPAARHMQTL 

310 320 330 340 350 360 

370 380 390 400 410 419 

45 orf 2 9ng-l .pep DGEMAGGNRP PKS I - TSEGKANAATYPKLVNQLNEQNLNN I AAQDPRLSLA I HEGKKNFP 

Mlllllhl Ih =1 =: =: h - : ::: = = 

orf 2 9-1 DGEMAGGNKPIKSLPNSAAEKRKQNFEKFNSNWSSASFDSVHKTLTPNAPGILS PDKVKT 

370 380 390 400 410 420 



50 



420 430 440 450 460 470 479 

orf 2 9ng- 1 . pep IGTATYEEADRLGKI WGEGARQTSGGGWLSRDGTRQYRPPTEKKSQFATTGIQANFETY 

orf29-1 RYTSLDGKITIIKDNENNYFRIHDNSRKQYLDSNGNAVKTGNLQGKQAKDYLQQQTHIRN

430 440 450 460 470 480 

Based on this analysis, including the presence of a putative leader sequence in the gonococcal
protein, it is predicted that the proteins from N. meningitidis and N. gonorrhoeae, and their epitopes,
could be useful antigens for vaccines or diagnostics, or for raising antibodies.

Example 21 

The following partial DNA sequence was identified in N. meningitidis [<SEQ ID 173>] (SEQ ID
NO: 173):

1 ATGAAAAAAC AAATCACCGC AGCCGTAATG ATGCTGTCTA TGATTGCCCC

51 CGCAATGGCA AACGGCTTGG ACAATCAGGC ATTTGAAGAC CAAATGTTCC 
101 ACACGCGGGC AGATGCACCG ATGCAG . . . 

This corresponds to the amino acid sequence [<SEQ ID 174; ORF30>] (SEQ ID NO: 174;
ORF30):

1 MKKQITAAVM MLSMIAPAMA NGLDNQAFED QMFHTRADAP MQ . . 
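
The correspondence between the partial DNA sequence (SEQ ID NO: 173) and the ORF30 amino acid sequence (SEQ ID NO: 174) can be checked with a short translation routine. The following is a minimal Python sketch, not part of the original disclosure. It assumes the standard genetic code and reading frame 1, and the names `translate`, `CODON_TABLE` and `PARTIAL_SEQ_ID_173` are illustrative only. Any codon containing an ambiguous base is reported as X, which is a simplification: a codon such as CGN still encodes arginine unambiguously.

```python
# Standard genetic code, built from the conventional T, C, A, G ordering.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    b1 + b2 + b3: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, b1 in enumerate(BASES)
    for j, b2 in enumerate(BASES)
    for k, b3 in enumerate(BASES)
}


def translate(dna: str) -> str:
    """Translate a DNA string in frame 1; unknown codons become 'X'."""
    # Drop whitespace and position numbers, keep only the base letters.
    seq = "".join(ch for ch in dna.upper() if ch.isalpha())
    usable = len(seq) - len(seq) % 3  # ignore a trailing partial codon
    return "".join(
        CODON_TABLE.get(seq[pos:pos + 3], "X") for pos in range(0, usable, 3)
    )


# Partial sequence transcribed from the listing above (assumed free of OCR errors).
PARTIAL_SEQ_ID_173 = """
ATGAAAAAAC AAATCACCGC AGCCGTAATG ATGCTGTCTA TGATTGCCCC
CGCAATGGCA AACGGCTTGG ACAATCAGGC ATTTGAAGAC CAAATGTTCC
ACACGCGGGC AGATGCACCG ATGCAG
"""

orf30 = translate(PARTIAL_SEQ_ID_173)
print(orf30)  # MKKQITAAVMMLSMIAPAMANGLDNQAFEDQMFHTRADAPMQ
```

The 126 bases give 42 codons, matching the 42 residues listed for ORF30.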

Further work revealed the complete nucleotide sequence [<SEQ ID 175>] (SEQ ID NO: 175):

1 ATGAAAAAAC AAATCACCGC AGCCGTAATG ATGCTGTCTA TGATTGCCCC

51 CGCAATGGCA AACGGCTTGG ACAATCAGGC ATTTGAAGAC CAAGTGTTCC

101 ACACGCGGGC AGATGCACCG ATGCAGTTGG CGGAGCTTTC TCAAAAGGAG

151 ATGAAGGAGA CAGAGGGGGC GTTTCTTCCA TTGGCTATCT TGGGTGGTGC

201 TGCCATTGGT ATGTGGACAC AGCATGGTTT TAGTTATGCA ACGACAGGCA

251 GACCAGCTTC TGTTAGAGAT GTTGCTATTG CTGGCGGATT AGGCGCAATT

301 CCTGGTGGTG TAGGCGCCGC AGGAAAGGTT GTTTCCTTTG CTAAATATGG

351 ACGTGAGATT AAAATCGGCA ATAATATGCG GATAGCCCCT TTCGGTAATA

401 GAACAGGTCA TCCTATTGGA AAATTTCCCC ATTATCATCG TCGAGTTACG

451 GATAATACGG GCAAGACTTT GCCTGGACAG GGAATTGGTC GTCATCGCCC

501 TTGGGAATCA AAATCTACGG ACAGATCATG GAAAAACCGC TTCTAA

This corresponds to the amino acid sequence [<SEQ ID 176; ORF30-1>] (SEQ ID NO: 176;
ORF30-1):

1 MKKQITAAVM MLSMIAPAMA NGLDNQAFED QVFHTRADAP MQLAELSQKE 

51 MKETEGAFLP LAILGGAAIG MWTQHGFSYA TTGRPASVRD VAIAGGLGAI

101 PGGVGAAGKV VSFAKYGREI KIGNNMRIAP FGNRTGHPIG KFPHYHRRVT

151 DNTGKTLPGQ GIGRHRPWES KSTDRSWKNR F* 

Computer analysis of this amino acid sequence gave the following results: 



Homology with a predicted ORF from N. meningitidis (strain A)

ORF30 (SEQ ID NO: 174) shows 97.6% identity over a 42aa overlap with an ORF (ORF30a)
(SEQ ID NO: 178) from strain A of N. meningitidis:

                   10        20        30        40
orf30.pep  MKKQITAAVMMLSMIAPAMANGLDNQAFEDQMFHTRADAPMQ
           ||||||||||||||||||||||||||||||| ||||||||||
orf30a     MKKQITAAVMMLSMIAPAMANGLDNQAFEDQVFHTRADAPMQLAELSQKEMKXTXGAFLP
                   10        20        30        40        50        60

orf30a     LXILGGAAIGMWTQHGFSYATTGRPASVRDVAIAGGLGAIPGXVGAAGKVVSFAKYGREI
                   70        80        90       100       110       120
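
The identity figure quoted for this comparison can be reproduced by counting identical residues across the overlap. The sketch below is illustrative only and is not the program used to generate the alignments in this document. It assumes the two rows are already aligned column by column, and it counts the overlap as the number of aligned columns, so a gap column would count as a mismatch. The function name `percent_identity` is hypothetical.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percentage of aligned columns holding the same residue in both rows."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned rows must have equal length")
    if not aligned_a:
        raise ValueError("alignment is empty")
    # A column is a match only if both rows carry the same non-gap residue.
    matches = sum(
        1 for x, y in zip(aligned_a, aligned_b) if x == y and x != "-"
    )
    return 100.0 * matches / len(aligned_a)


# Rows taken from the ORF30 / ORF30a comparison above (no gaps in the overlap).
ORF30 = "MKKQITAAVMMLSMIAPAMANGLDNQAFEDQMFHTRADAPMQ"
ORF30A = "MKKQITAAVMMLSMIAPAMANGLDNQAFEDQVFHTRADAPMQLAELSQKEMKXTXGAFLP"

overlap = len(ORF30)
identity = percent_identity(ORF30, ORF30A[:overlap])
print(f"{identity:.1f}% identity over a {overlap} aa overlap")
```

Here 41 of the 42 aligned residues are identical (the only difference is M versus V at position 32), which gives the 97.6% stated in the text.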

The complete length ORF30a nucleotide sequence [<SEQ ID 177>] (SEQ ID NO: 177) is:



1 ATGAAAAAAC AAATCACCGC AGCCGTAATG ATGCTGTCTA TGATTGCCCC 

51 CGCAATGGCA AACGGCTTGG ACAATCAGGC ATTTGAAGAC CAAGTGTTCC 

101 ACACGCGGGC AGATGCACCG ATGCAGTTGG CGGAGCTTTC TCAAAAGGAG 

151 ATGAAGGANA CAGNGGGGGC GTTTCTTCCA TTGGNTATCT TGGGTGGTGC

201 TGCCATTGGT ATGTGGACAC AGCATGGTTT TAGTTATGCA ACGACAGGCA 

251 GACCAGCTTC TGTTAGAGAT GTTGCTATTG CTGGCGGATT AGGCGCAATT 

301 CCTGGTGNTG TAGGCGCCGC AGGAAAGGTT GTTTCCTTTG CTAAATATGG 

351 ACGTGAGATT AAAATCGGCA ATAATATGCG GATAGCCCCT TTCGGTAATA 

401 GAACAGGTCA TCCTATTGGN AAATTTCCCC ATTATCATCG TCGAGTTACG

451 GATAATACGG GCAAGACTTT GCCTGGACAG GGAATTGGTC GTCATCGCCC 

501 TTGGGAATCA AAATCTACGG ACAGATCATG GAAAAACCGC TTCTAA 

This encodes a protein having amino acid sequence [<SEQ ID 178>] (SEQ ID NO: 178):

1 MKKQITAAVM MLSMIAPAMA NGLDNQAFED QVFHTRADAP MQLAELSQKE

51 MKXTXGAFLP LXILGGAAIG MWTQHGFSYA TTGRPASVRD VAIAGGLGAI

101 PGXVGAAGKV VSFAKYGREI KIGNNMRIAP FGNRTGHPIG KFPHYHRRVT

151 DNTGKTLPGQ GIGRHRPWES KSTDRSWKNR F* 

ORF30a (SEQ ID NO: 178) and ORF30-1 (SEQ ID NO: 176) show 97.8% identity in 181 aa
overlap:

orf30a.pep  MKKQITAAVMMLSMIAPAMANGLDNQAFEDQVFHTRADAPMQLAELSQKEMKXTXGAFLP 60
            |||||||||||||||||||||||||||||||||||||||||||||||||||| | |||||
orf30-1     MKKQITAAVMMLSMIAPAMANGLDNQAFEDQVFHTRADAPMQLAELSQKEMKETEGAFLP 60

orf30a.pep  LXILGGAAIGMWTQHGFSYATTGRPASVRDVAIAGGLGAIPGXVGAAGKVVSFAKYGREI 120
            | |||||||||||||||||||||||||||||||||||||||| |||||||||||||||||
orf30-1     LAILGGAAIGMWTQHGFSYATTGRPASVRDVAIAGGLGAIPGGVGAAGKVVSFAKYGREI 120

orf30a.pep  KIGNNMRIAPFGNRTGHPIGKFPHYHRRVTDNTGKTLPGQGIGRHRPWESKSTDRSWKNR 180
            ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
orf30-1     KIGNNMRIAPFGNRTGHPIGKFPHYHRRVTDNTGKTLPGQGIGRHRPWESKSTDRSWKNR 180

orf30a.pep  FX
            ||
orf30-1     FX



Homology with a predicted ORF from N. gonorrhoeae

ORF30 (SEQ ID NO: 174) shows 97.6% identity over a 42aa overlap with a predicted ORF
(ORF30.ng) (SEQ ID NO: 180) from N. gonorrhoeae:



orf30.pep    MKKQITAAVMMLSMIAPAMANGLDNQAFEDQMFHTRADAPMQ 42
             ||||||||||||||||||||||||||||||| ||||||||||
orf30ng      MKKQITAAVMMLSMIAPAMANGLDNQAFEDQVFHTRADAPMQLAELSQKEMKETEGAFLP 60
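
The percent identity quoted for an overlap can be recomputed from the two aligned rows. The sketch below is illustrative only: it assumes that identity is counted as identical columns divided by the total number of aligned columns (gap columns included), which may differ from the convention of the alignment program actually used; the function name `percent_identity` is hypothetical.

```python
def percent_identity(row_a: str, row_b: str) -> float:
    """Percent identity over two aligned rows of equal length.

    Assumption: the denominator is the full alignment length, and a gap
    character '-' never counts as a match.
    """
    if len(row_a) != len(row_b):
        raise ValueError("aligned rows must have equal length")
    matches = sum(1 for a, b in zip(row_a, row_b) if a == b and a != "-")
    return 100.0 * matches / len(row_a)


# ORF30 against the first 42 residues of ORF30ng, as in the alignment above.
orf30 = "MKKQITAAVMMLSMIAPAMANGLDNQAFEDQMFHTRADAPMQ"
orf30ng_head = "MKKQITAAVMMLSMIAPAMANGLDNQAFEDQVFHTRADAPMQ"
print(round(percent_identity(orf30, orf30ng_head), 1))  # 97.6
```

The two rows differ at a single position (M versus V at residue 32), giving 41 identical residues out of 42, consistent with the 97.6% stated above.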

The complete length ORF30ng nucleotide sequence [<SEQ ID 179>] (SEQ ID NO: 179) is:



1 ATGAAAAAAC AAATCACCGC AGCCGTAATG ATGCTGTCTA TGATCGCCCC 

51 CGCAATGGCA AACGGATTGG ACAATCAGGC ATTTGAAGAC CAAGTGTTCC 

101 ACACGCGGGC AGATGCGCCG ATGCAGTTGG CGGAGCTTTC TCAGAAGGAG 

151 ATGAAGGAGA CTGAAGGGGC TTTTCTTCCA TTGGCTATCT TGGGTGGTGC 

201 TGCCATTGGT ATGTGGACAC AGCATGGTTT TAGTTATGCA ACGACAGGCA
251 GACCAGCTTC TGTTAGAGAT GTTGCTGGCG GATTAGGCGC AATTCCTGGT
301 GATGTAGGTG CTGCAGGAAA GGTTGTTTCC TTTGCTAAAT ATGGACGTGA
351 GATTAAAATC GGCAATAATA TGCGGATAGC CCCTTTCGGT AATAGAACAG
401 GTCATCCTAT TGGAAAATTT CCCCATTATC ATCGTCGAGT TACGGATAAT
451 ACGGGCAAGA CTTTGCCTGG ACAGGGAATT GGTCGTCATC GCCCTTGGGA
501 ATCAAAATCT ACGGACAGAT CATGGAAAAA CCGCTTCTAA 

This encodes a protein having amino acid sequence [<SEQ ID 180>] (SEQ ID NO: 180):

1 MKKQITAAVM MLSMIAPAMA NGLDNQAFED QVFHTRADAP MQLAELSQKE
51 MKETEGAFLP LAILGGAAIG MWTQHGFSYA TTGRPASVRD VAGGLGAIPG
101 DVGAAGKVVS FAKYGREIKI GNNMRIAPFG NRTGHPIGKF PHYHRRVTDN
151 TGKTLPGQGI GRHRPWESKS TDRSWKNRF*

ORF30ng (SEQ ID NO: 180) and ORF30-1 (SEQ ID NO: 176) show 98.3% identity in 181 aa
overlap:

                     10        20        30        40        50        60
orf30ng.pep  MKKQITAAVMMLSMIAPAMANGLDNQAFEDQVFHTRADAPMQLAELSQKEMKETEGAFLP
             ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
orf30-1      MKKQITAAVMMLSMIAPAMANGLDNQAFEDQVFHTRADAPMQLAELSQKEMKETEGAFLP
                     10        20        30        40        50        60

                     70        80        90         100       110
orf30ng.pep  LAILGGAAIGMWTQHGFSYATTGRPASVRDVA--GGLGAIPGDVGAAGKVVSFAKYGREI
             ||||||||||||||||||||||||||||||||  |||||||| |||||||||||||||||
orf30-1      LAILGGAAIGMWTQHGFSYATTGRPASVRDVAIAGGLGAIPGGVGAAGKVVSFAKYGREI
                     70        80        90       100       110       120

            120       130       140       150       160       170
orf30ng.pep  KIGNNMRIAPFGNRTGHPIGKFPHYHRRVTDNTGKTLPGQGIGRHRPWESKSTDRSWKNR
             ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
orf30-1      KIGNNMRIAPFGNRTGHPIGKFPHYHRRVTDNTGKTLPGQGIGRHRPWESKSTDRSWKNR
                    130       140       150       160       170       180

            180
orf30ng.pep  FX
             ||
orf30-1      FX

Based on this analysis, including the presence of a putative leader sequence in the gonococcal
protein, it is predicted that the proteins from N. meningitidis and N. gonorrhoeae, and their epitopes,
could be useful antigens for vaccines or diagnostics, or for raising antibodies.

Example 22 

The following partial DNA sequence was identified in N. meningitidis [<SEQ ID 181>] (SEQ ID
NO: 181):

1 ATGAATAAAA CTCTCTATCG TGTAATTTTC AACCGCAAAC GTGGGGCTGT 

51 GrTAGCCGTT GCTGAAACTA CCAAGCGCGA AGGTAAAAGC TGTGCCGATA 

101 GTGATTCAGG CAGCGCTCAT GTGAAATCTG TTCCTTTTGG TACTACTCAT 

151 GCACCTGTTT GTg . CGTTaC AAATATCTTT TCTTTTTCTT TATTGGGCTT

201 TTCTTTATGT TTGGCTGTAG GtacGGyCAA TATTGCTTTT GCTGATGGCA

251 TT. . 

This corresponds to the amino acid sequence [<SEQ ID 182; ORF31>] (SEQ ID NO: 182;
ORF31):

1 MNKTLYRVIF NRKRGAVXAV AETTKREGKS CADSDSGSAH VKSVPFGTTH 

51 APVCXVTNIF SFSLLGFSLC LAVGTXNIAF ADGI . . 

Further work revealed a further partial nucleotide sequence [<SEQ ID 183>] (SEQ ID NO: 183):

1 ATGAATAAAA CTCTCTATCG TGTAATTTTC AACCGCAAAC GTGGGGCTGT

51 GGTAGCCGTT GCTGAAACTA CCAAGCGCGA AGGTAAAAGC TGTGCCGATA 

101 GTGATTCAGG CAGCGCTCAT GTGAAATCTG TTCCTTTTGG TACTACTCAT 

151 GCACCTGTTT GTCGTTCAAA TATCTTTTCT TTTTCTTTAT TGGGCTTTTC 

201 TTTATGTTTG GCTGTAGGTA CGGCCAATAT TGCTTTTGCT GATGGCATT..



This corresponds to the amino acid sequence [<SEQ ID 184; ORF31-1>] (SEQ ID NO: 184;
ORF31-1):

1 MNKTLYRVIF NRKRGAVVAV AETTKREGKS CADSDSGSAH VKSVPFGTTH
51 APVCRSNIFS FSLLGFSLCL AVGTANIAFA DGI..

Computer analysis of this amino acid sequence gave the following results:

Homology with a predicted ORF from N. gonorrhoeae

ORF31 (SEQ ID NO: 182) shows 76.2% identity over a 84aa overlap with a predicted ORF
(ORF31.ng) (SEQ ID NO: 186) from N. gonorrhoeae:



orf31.pep MNKTLYRVIFNRKRGAVXAVAETTKREGKSCADSDSGSAHVKSVPFGTTHAPVCXVTNIF 60
I I I I I I I I I M I I I I I I I i II I I I I I I I I I I |||::|||| I II :: | 

or f 3 lng MNKTLY RVI FNRKRGAWAVAETTKREGKS CADSGSGS VYVKS VS F I PTH SKAF 5 4 

orf 3 1 . pep SFSLLGFSLCLAVGTXNIAFADGI 84 

II lllllllhll llllllll' 
orf 3 lng CFSALGFSLCLALGTVNIAFADGIITDKAAPKTQQATILQTGNGIPQVNIQTPTSAGVSV 114 

The complete length ORF31ng nucleotide sequence [<SEQ ID 185>] (SEQ ID NO: 185) is:

1 ATGAACAAAA CCCTCTATCG TGTGATTTTC AACCGCAAAC GCGGTGCTGT 

51 GGTAGCTGTT GCCGAAACCA CCAAGCGCGA AGGTAAAAGC TGTGCCGATA 

101 GTGGTTCGGG CAGCGTTTAT GTGAAATCCG TTTCTTTCAT TCCTACTCAT 

151 TCCAAAGCCT TTTGTTTTTC TGCATTAGGC TTTTCTTTAT GTTTGGCTTT 

201 GGGTACGGTC AATATTGCTT TTGCTGACGG CATTATTACT GATAAAGCTG 

251 CTCCTAAAAC CCAACAAGCC ACGATTCTGC AAACAGGTaa cGGCATACCG 

301 CAAGTCAATA TTCAAACCCC TACTTCGGCA GGGGTTTCTG TTAATCAATA
351 TGCCCAGTTT GATGTGGGTA ATCGCGGGGC GATTTTAAAC AACAGTCGCA 

401 GCAACACCCA AACACAGCTA GGCGGTTGGA TTCAAGGCAA TCCTTGGTTG
451 ACAAGGGGCG AAGCACGTGT GGTTGTAAAC CAAATCAACA GCAGCCATCC
501 TTCACAACTG AATGGCTATA TTGAAGTGGG TGGACGACGT GCAGAAGTCG 
551 TTATTGCCAA TCCGGCAGGG ATTGCAGTCA ATGGTGGTGG TTTTATCAAT 
601 GCTTCCCGTG CCACTTTGAC GACAGGCCAA CCGCAATATC AAGCAGGAGA 
651 CTTTAGCGGC TTTAAGATAA GGCAAGGCAA TGCTGTAATC GCCGGACACG 
701 GTTTGGATGC CCGTGATACC GATTTCACAC GTATTCTTGT ATGCCAACAA 
751 AATCACCTTG ATCAGTACGG CCGAACAAGC AGGCATTCGT AA 

This encodes a protein having amino acid sequence [<SEQ ID 186>] (SEQ ID NO: 186):

1 MNKTLYRVIF NRKRGAVVAV AETTKREGKS CADSGSGSVY VKSVSFIPTH
51 SKAFCFSALG FSLCLALGTV NIAFADGIIT DKAAPKTQQA TILQTGNGIP
101 QVNIQTPTSA GVSVNQYAQF DVGNRGAILN NSRSNTQTQL GGWIQGNPWL
151 TRGEARVVVN QINSSHPSQL NGYIEVGGRR AEVVIANPAG IAVNGGGFIN
201 ASRATLTTGQ PQYQAGDFSG FKIRQGNAVI AGHGLDARDT DFTRILVCQQ
251 NHLDQYGRTS RHS*

This gonococcal protein shares 50% identity over a 149aa overlap with the pore-forming
hemolysin-like HecA protein (SEQ ID NO: 1125) from Erwinia chrysanthemi (accession number
L39897):

orf 3 lng 96 GNG I PQVN I QTPTS AGVS VNQYAQFDVGNRGAI LNNSRSN - TQTQLGGW I QGNPWLTRGE 154 

GNG+P VNI TP ++G+S N+Y F+V NRG ILNN + T +QLGG IQ NP L 
HecA 45 GNGVPWNIATPDASGLSHNRYHDFNVDNRGLILNNGTARLTPSQLGGLIQNNPNLNGRA 104 

Orf31ng 155 ARVWNQINSSHPSQLNGYIEVGGRRAEWIANPAGIAVNGGGFINASRATLTTGQPQYQ 214 

A ++N++ S + S+L GY+EV G+ A W+ANP GI +G GF+N R TLTTG PQ+ 
HecA 105 AAAILNEWSPNRSRLAGYLEVAGQAANVWANPYGITCSGCGFLNTPRLTLTTGTPQFD 164 

Orf 3 lng 215 - AGD FS G F K I RQGN AV I AGHGLD ARDTDF 242 

AG SG +R G+ +1 G GLDA +D+ 
HecA 165 AAGGLS GLD VRGGD I L I DGAGLDAS RS D Y 193 

Furthermore, ORF31ng (SEQ ID NO: 186) and ORF31-1 (SEQ ID NO: 184) show 79.5% identity
in 83 aa overlap:

10 20 30 40 50 60

or f 3 1 - 1 . pep MNKTLYRVIFNRKRGAWAVAETTKREGKSCADSDSGSAHVKSVPFGTTHAPVCRSNIFS 

I I M I I I II I I I I I I I I M I I I I I I I M I I hhh II I || hi 
or f 3 lng MNKTLYRVI FNRKRGAWAVAETTKREGKSCADSGSGS VYVKSVS F I PTH SKAFC 

10 20 30 40 50 

70 80 
orf 31-1 .pep FSLLGFSLCLAVGTANIAFADGI 

II I I I I I I hi hi I I I I I I I 
orf 3 lng FSALGFSLCLALGTVNIAFADGI I TDKAAPKTQQAT I LQTGNGI PQVN IQTPTS AGVSVN 

60 70 80 90 100 110 

On this basis, including the homology with hemolysins, and also with adhesins, it is predicted that
the proteins from N. meningitidis and N. gonorrhoeae, and their epitopes, could be useful antigens
for vaccines or diagnostics, or for raising antibodies.

Example 23 

The following partial DNA sequence was identified in N. meningitidis [<SEQ ID 187>] (SEQ ID
NO: 187):

1 ATGAATACTC CTCCTTTTGT CTGTTGGATT TTTTGCAAGG TCATCGACAA 

51 TTTCGGCGAC ATCGGCGTTT CGTGGCGGCT CGCCCGTGTT TTGCACCGCG 

101 AACTCGGTTG GCAGGTGCAT TTGTGGACGG ACGATGTGTC CGCCTTGCGT 

151 GCGCTTTGCC CTGATTTGCC CGATGTTCCC TGCGTTCATC AGGATATTCA 

201 TGTCCGCACT TGGCATTCCG ATGCGGCAGA TATTGATACC GCG. . 

This corresponds to the amino acid sequence [<SEQ ID 188; ORF32>] (SEQ ID NO: 188;
ORF32):



1 MNTPPFVCWI FCKVIDNFGD IGVSWRLARV LHRELGWQVH LWTDDVSALR 
51 ALCPDLPDVP CVHQDIHVRT WHSDAADIDT A. . 

Further work revealed the complete nucleotide sequence [<SEQ ID 189>] (SEQ ID NO: 189):



1 ATGAATACTC CTCCTTTTGT CTGTTGGATT TTTTGCAAGG TCATCGACAA 

51 TTTCGGCGAC ATCGGCGTTT CGTGGCGGCT CGCCCGTGTT TTGCACCGCG 

101 AACTCGGTTG GCAGGTGCAT TTGTGGACGG ACGATGTGTC CGCCTTGCGT 

151 GCGCTTTGCC CTGATTTGCC CGATGTTCCC TGCGTTCATC AGGATATTCA 

201 TGTCCGCACT TGGCATTCCG ATGCGGCAGA TATTGATACC GCGCCTGTTC 

251 CCGATGTCGT CATCGAAACT TTTGCCTGCG ACCTGCCCGA AAATGTGCTG 

301 CACATTATCC GCCGACACAA GCCGCTTTGG CTGAATTGGG AATATTTGAG 

351 CGCGGAGGAA AGCAATGAAA GGCTGCATCT GATGCCTTCG CCGCAGGAGG

401 GTGTTCAAAA ATATTTTTGG TTTATGGGTT TCAGCGAAAA AAGCGGCGGG
451 TTGATACGCG AACGTGATTA CTGCGAAGCC GTCCGTTTCG ATACTGAAGC
501 CCTGCGAGAG CGGCTGATGC TGCCCGAAAA AAACGCCTCC GAATGGCTGC 
551 TTTTCGGCTA TCGGAGCGAT GTTTGGGCAA AGTGGCTGGA AATGTGGCGA 
601 CAGGCAGGCA GCCCGATGAC ACTGTTGCTG GCGGGGACGC AAATCATCGA 
651 CAGCCTCAAA CAAAGCGGCG TTATTCCGCA AGATGCCCTG CAAAACGACG 
701 GCGATGTTTT TCAGACGGCA TCCGTCCGCC TCGTCAAAAT CCCTTTCGTG 
751 CCGCAACAGG ACTTCGACCA ACTGCTGCAC CTTGCCGACT GCGCCGTCAT
801 CCGCGGCGAA GACAGTTTCG TGCGCGCCCA GCTTGCGGGC AAACCCTTCT
851 TTTGGCACAT CTACCCGCAA GACGAGAATG TCCATCTCGA CAAACTCCAC
901 GCCTTTTGGG ATAAGGCACA CGGTTTCTAC ACGCCCGAAA CCGTGTCGGC
951 ACACCGCCGT CTTTCGGACG ACCTCAACGG CGGAGAGGCT TTATCCGCAA
1001 CACAACGCCT CGAATGTTGG CAAACCCTGC AACAACATCA AAACGGCTGG
1051 CGGCAAGGCG CGGAGGATTG GAGCCGTTAT CTTTTCGGGC AGCCGTCAGC
1101 TCCTGAAAAA CTCGCTGCCT TTGTTTCAAA GCATCAAAAA ATACGCTAG



This corresponds to the amino acid sequence [<SEQ ID 190; ORF32-1>] (SEQ ID NO: 190;
ORF32-1):



1 MNTPPFVCWI FCKVIDNFGD IGVSWRLARV LHRELGWQVH LWTDDVSALR 

51 ALCPDLPDVP CVHQDIHVRT WHSDAADIDT APVPDVVIET FACDLPENVL

101 HIIRRHKPLW LNWEYLSAEE SNERLHLMPS PQEGVQKYFW FMGFSEKSGG 

151 LIRERDYCEA VRFDTEALRE RLMLPEKNAS EWLLFGYRSD VWAKWLEMWR 

201 QAGSPMTLLL AGTQIIDSLK QSGVIPQDAL QNDGDVFQTA SVRLVKIPFV 

251 PQQDFDQLLH LADCAVIRGE DSFVRAQLAG KPFFWHIYPQ DENVHLDKLH 

301 AFWDKAHGFY TPETVSAHRR LSDDLNGGEA LSATQRLECW QTLQQHQNGW

351 RQGAEDWSRY LFGQPSAPEK LAAFVSKHQK IR*

Computer analysis of this amino acid sequence gave the following results: 



Homology with a predicted ORF from N. meningitidis (strain A) 



ORF32 (SEQ ID NO: 188) shows 93.8% identity over a 81 aa overlap with an ORF (ORF32a)
(SEQ ID NO: 192) from strain A of N. meningitidis:



10 20 30 40 50 60 

orf 32 . pep MNTP PFVCW I FCKV I DNFGD IGVSWRLARVLHRELGWQVHLWTDDVS ALRALCPDLPDVP 

Mill! MM i 1 1 1 1 1 i 1 1 1 1 1 1 1 1 1 1 1 M 1 1 II 1 1 1 1 1 M 1 1 1 1 1 1 1 1 1 1 1 1 1 

orf 32a MNTPPFSAGXFCKVIDNFGDIGVSWRLARVLHRELGWQVHLWTDDVSALRALCPDLPDVX 

10 20 30 40 50 60 



70 80 
orf 32 .pep CVHQD I HVRTWHS DAAD I DTA 

MM! I i ' I f ■ 1 1 1 1 1 1 1 

orf 32a CVHQDIHVRTWHSDAADIDTAPVXDWIETFACDLPENVLHIIRRHKPLWLXWEYLSAEX 

70 80 90 100 110 120 



The complete length ORF32a nucleotide sequence [<SEQ ID 191>] (SEQ ID NO: 191) is:



1 ATGAATACTC CTCCTTTTTC TGCTGGANTT TTTTGCAAGG TCATCGACAA 

51 TTTCGGCGAC ATCGGCGTTT CGTGGCGGCT TGCCCGTGTT TTGCACCGCG 

101 AACTCGGTTG GCAGGTGCAT TTGTGGACGG ACGATGTGTC CGCCTTGCGT 

151 GCGCTTTGCC CTGATTTGCC CGATGTTCNC TGCGTTCATC AGGATATTCA 

201 TGTCCGCACT TGGCATTCCG ATGCGGCAGA TATTGATACC GCGCCTGTTC 

251 NCGATGTCGT CATCGAAACT TTTGCCTGCG ACCTGCCCGA AAATGTGCTG 

301 CACATCATCC GCCGACACAA GCCGCTTTGG CTGAANTGGG AATATTTGAG 

351 CGCGGAGGAN AGCAATGAAA GGCTGCACNT GATGCCTTCG CCGCAGGAGA

401 GTGTTCNAAA ATANTTTTGG TTTATGGGTT TCAGCGAANN NAGCGGCGGA
451 CTGATACGCG AACGCGATTA CTGCGAAGCC GTCCGTTTCG ATAGCGGAGC
501 CTTGCGCAAG AGGCTGATGC TTCCCGAAAA AAACGNCCCC GAATGGCTGC
551 TTTTCGGCTA TCGGAGCGAT GTTTGGGCAA AGTGGCTGGA AATGTGGCGA
601 CAGGCAGGCA GTCCGTTGAC ACTTTTGCTG GCNGGGGCGC ANATTATCGA

651 CAGCCTCAAA CAAAACGGCG TTATTCCGCA AGATGCCCTG CAAAACGACG 

701 GCGATGTTTT TCAGACGGCA TCCGTCCGCC TCGTCAAAAT CCCTTTCGTG 

751 CCGCAACAGG ACTTCGACAA ACTGCTGCAC CTTGCCGACT GCGCCGTCAT 

801 CCGCGGCGAA GACAGTTTCG TGCGCGCCCA GCTTGCGGGC AAACCCTTCT

851 TTTGGCACAT CTACCCGCAA GATGAGAATG TCCATCTCGA CAAACTCCAC

901 GCCTTTTGGG ATAAGGCACA CGGTTTCTAC ACGCCCGAAA CCGCATCGGC 

951 ACACCGCCGC CTTTCAGACG ACCTCAACGG CGGAGAGGCT TTATCCGCAA 

1001 CACAACGCCT CGAATGTTGG CAAATCCTGC AACAACATCA AAACGGCTGG 

1051 CGGCAAGGCG CGGAGGATTG GAGCCGTTAT CTTTTTGGGC AGCCTTCCGC

1101 ATCCGAAAAA CTCGCCGCCT TTGTTTCAAA GCATCAAAAA ATACGCTAG 



This encodes a protein having amino acid sequence [<SEQ ID 192>] (SEQ ID NO: 192):



1 MNTPPFSAGX FCKVIDNFGD IGVSWRLARV LHRELGWQVH LWTDDVSALR 

51 ALCPDLPDVX CVHQDIHVRT WHSDAADIDT APVXDVVIET FACDLPENVL

101 HIIRRHKPLW LXWEYLSAEX SNERLHXMPS PQESVXKXFW FMGFSEXSGG 

151 LIRERDYCEA VRFDSGALRK RLMLPEKNXP EWLLFGYRSD VWAKWLEMWR 

201 QAGSPLTLLL AGAXIIDSLK QNGVIPQDAL QNDGDVFQTA SVRLVKIPFV

251 PQQDFDKLLH LADCAVIRGE DSFVRAQLAG KPFFWHIYPQ DENVHLDKLH 

301 AFWDKAHGFY TPETASAHRR LSDDLNGGEA LSATQRLECW QILQQHQNGW

351 RQGAEDWSRY LFGQPSASEK LAAFVSKHQK IR* 

ORF32a (SEQ ID NO: 192) and ORF32-1 (SEQ ID NO: 190) show 93.2% identity in 382 aa
overlap:



25 10 20 30 40 50 60 

or f 3 2 - 1 . pep MNTPPFVCWI FCKVIDNFGD IGVSWRLARVLHRELGWQVHLWTDDVSALRALCPDLPDVP 

I II II I 1 1 M 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 , 1 ! I 1 1 1 1 1 II 1 1 1 1 1 1 1 II 1 1 1 1 1 

orf32a MNTPPFSAGXFCKVIDNFGDIGVSWRLARVLHRELGWQVHLWTDDVSALRALCPDLPDVX 

10 20 30 40 50 60 



30 70 80 90 100 110 120 

orf 32 - 1 . pep CVHQD I HVRTWHSDAAD I DTAP VPDW I ETFACDLPENVLH I IRRHKPLWLNWEYLSAEE 

I II 1 1 1 1 Ml 1 1 1 II 1 1 1 1 1 1 1 1 1 1 1 1 1 1 . 1 II I II II I II 1 1 1 1 1 llllll! 

orf 32a CVHQD I HVRTWHSDAAD IDTAPVXDWI ETFACDLPENVLH 1 1 RRHKPLW LXWEYLSAEX 

70 80 90 100 110 120 

35 130 140 150 160 170 180 

orf 32 - 1 . pep SNERLHLMPSPQEGVQKYFWFMGFSEKSGGLIRERDYCEAVRFDTEALRERLMLPEKNAS 

llllll IMIIhl I MINIM 1 1 II I i 1 1 1 M 1 1 1 1 1 1 = IIMIIIIII 

orf 32a SNERLHXMPS PQESVXKXFWFMGFSEXSGGL I RERDYCEAVRFDSGALRKRLMLPEKNXP 

130 140 150 160 170 180 

40 190 200 210 220 230 240 

orf 32-1. pep EWLLFGYRSDVWAKWLEMWRQAGSPMTLLLAGTQI IDSLKQSGVI PQDALQNDGDVFQTA 

I I I I I I I I I I I I I I I I I I I I I I I I I : I I I I I I : I t I 1 I M = I I I I I I I II I I I I I I I 1 I 
O r f 3 2 a E WLLFG YRSDVWAKWLEMWRQAGS PLTLLLAGAX 1 1 DS LKQNGV I PQDALQNDGDVFQTA 

190 200 210 220 230 240 

45 250 260 270 280 290 300 

orf 32 - 1 . pep SVRLVKIPFVPQQDFDQLLHLADCAVIRGEDSFVRAQLAGKPFFWHI YPQDENVHLDKLH 

, I ' I I IN I I I I I I M : I I I I II I I I I I I I I I I I I I I I l : II I III I I I I I I I I I I M I 
orf 32a SVRLVKIPFVPQQDFDKLLHLADCAVIRGEDSFVRAQLAGKPFFWHIYPQDENVHLDKLH 

250 260 270 280 290 300 

310 320 330 340 350 360
orf32-1.pep AFWDKAHGFYTPETVSAHRRLSDDLNGGEALSATQRLECWQTLQQHQNGWRQGAEDWSRY

IIIMIIIIIIIMIIIIIIIIIIMII INI Mill MIIIIIIIIIIIIIIM 

or f 3 2 a AFWDKAHGFYTPETASAHRRLSDDLNGGEALSATQRLECWQ I LQQHQNGWRQGAEDWSRY 

310 320 330 340 350 360 



                    370       380
orf32-1.pep  LFGQPSAPEKLAAFVSKHQKIRX
             ||||||| |||||||||||||||
orf32a       LFGQPSASEKLAAFVSKHQKIRX
                    370       380



Homology with a predicted ORF from N. gonorrhoeae 



ORF32 (SEQ ID NO: 188) shows 95.1% identity over a 82aa overlap with a predicted ORF
(ORF32.ng) (SEQ ID NO: 194) from N. gonorrhoeae:



orf 32 . pep MNTPPF- VCWIFCKVIDNFGDIGVSWRLARVLHRELGWQVHLWTDDVSALRALCPDLP 57 

Ml I I I I II I I I I I I I I I I I I I I I I I I II I M I I I I I I I I I I I I I I I I I I I I M I 

orf 3 2ng MVMNTYAFPVCW I FCKV IDNFGD I GVS WRLARVLHRELGWQVHLWTDDVS ALRALCPDLP 6 0 

orf 32. pep DVPCVHQDIHVRTWHSDAADIDTA 81 

III I I M I II I I I I I I II M I I I 

orf32ng DVPFVHQDIHVRTWHSDAADIDTAPVPDAVIETFACDLPENVLNIIRRHKPLWLNWEYLS 12 0 

An ORF32ng nucleotide sequence [<SEQ ID 193>] (SEQ ID NO: 193) was predicted to encode a
protein having amino acid sequence [<SEQ ID 194>] (SEQ ID NO: 194):



1 MVMNTYAFPV CWIFCKVIDN FGDIGVSWRL ARVLHRELGW QVHLWTDDVS 

51 ALRALCPDLP DVPFVHQDIH VRTWHSDAAD IDTAPVPDAV IETFACDLPE 

101 NVLNIIRRHK PLWLNWEYLS AEESNERLHL MPSPQEGVQK YFWFMGFSEK 

151 SGGLIRERDY REAVRFDTEA LRRRLVLPEK NAPEWLLFGY RGDVWAKWLD 

201 MWQQAGSLMT LLLAGAQIID SLKQSGVIPQ NALQNEGGVF QTASVRLVKI

251 PFVPQQDFDK LLHLADCAVI RGEDSFVRTQ LAGKPFFWHI YPQDENVHLD

301 KLHAFWDKAY GFYTPETASV HRLLSDDLNG GEALSATQRL ECGVL* 



Further sequencing revealed the following DNA sequence [<SEQ ID 195>] (SEQ ID NO: 195):



1 ATGAATACAT ACGCTTTTCC TGTCTGTTGG ATTTTTTGCA AGGTCATCGA
51 CAATTTCGGC GACATCGGCG TTTCGTGGCG GCTCGCCCGT GTTTTGCACC
101 GCGAACTCGG TTGGCAGGTG CATTTGTGGA CGGACGACGT GTCCGCCTTG
151 CGCGCGCTTT GTCCCGATTT GCCCGATGTT CCCTTCGTTC ATCAGGATAT
201 TCATGTCCGC ACTTGGCATT CCGATGCGGC AGACATTGAT ACCGCGCCCG
251 TTCCCGATGC CGTTATCGAA ACTTTTGCCT GCGACCTGCC CGAAAATGTG
301 CTGAACATCA TCCGCCGACA CAAACCGCTT TGGCTGAATT GGGAATATTT
351 GAGCGCGGAG GAAAGCAATG AAAGGCTGCA CCTGATGCCT TCGCCGCAGG
401 AGGGCGTTCA AAAATATTTT TGGTTTATGG GTTTCAGCGA AAAAAGCGGC
451 GGGTTGATAC GCGAACGCGA TTACCGCGAA GCCGTCCGTT TCGATACCGA
501 AGCCCTGCGC CGGCGGCTGG TGCTGCCCGA AAAAAACGCC CCCGAATGGC
551 TGCTTTTCGG CTATCGGGGC GATGTTTGGG CAAAGTGGCT GGACATGTGG
601 CAACAGGCAG GCAGCCTGAT GACCCTACTG CTGGCGGGGG CGCAAATTAT
651 CGACAGCCTC AAACAAAGCG GCGTTATTCC GCAAAACGCC CTGCAAAAtg
701 aaggcgGTGT CTTTCagacG gcatccgTcC gccttGTCAA AAtcCCGTTC
751 GTGCcGCAAC AGGAcTTCGA CAAATTGCTG CAcctcgcCG ACTGCGCCGT
801 GATACGCGGC GAAGACAGTT TCGTGCGTAC CCAGCTTGCC GGAAAACCCT
851 TTTTTTGGCA CATCTACCCG CAAGACGAGA ATGTCCATCT CGACAAACTC
901 CACGCCTTTT GGGATAAGGC ATACGGCTTC TACACGCCCG AAACCGCATC
951 GGTGCACCGC CTCCTTTCGG ACGACCTCAA CGGCGGAGAG GCTTTATCCG
1001 CAACACAACG CCTCGAATGT TGGCAAACCC TGCAACAACA TCAAAACGGC
1051 TGGCGGCAAG GCGCGGAGGA TTGGAGCCGT TATCTTTTCG GGCAGCCTTC
1101 CGCATCCGAA AAACTCGCCG CCTTTGTTTC AAAGCATCAA AAAATACGCT
1151 AG

This encodes a protein having amino acid sequence [<SEQ ID 196; ORF32ng-1>] (SEQ ID NO:
196; ORF32ng-1):



1 MNTYAFPVCW IFCKVIDNFG DIGVSWRLAR VLHRELGWQV HLWTDDVSAL 

51 RALCPDLPDV PFVHQDIHVR TWHSDAADID TAPVPDAVIE TFACDLPENV 

101 LNIIRRHKPL WLNWEYLSAE ESNERLHLMP SPQEGVQKYF WFMGFSEKSG 

151 GLIRERDYRE AVRFDTEALR RRLVLPEKNA PEWLLFGYRG DVWAKWLDMW 

201 QQAGSLMTLL LAGAQIIDSL KQSGVIPQNA LQNEGGVFQT ASVRLVKIPF

251 VPQQDFDKLL HLADCAVIRG EDSFVRTQLA GKPFFWHIYP QDENVHLDKL 

301 HAFWDKAYGF YTPETASVHR LLSDDLNGGE ALSATQRLEC WQTLQQHQNG 

351 WRQGAEDWSR YLFGQPSASE KLAAFVSKHQ KIR*

ORF32ng-1 (SEQ ID NO: 196) and ORF32-1 (SEQ ID NO: 190) show 93.5% identity in 383 aa
overlap:






10 20 30 40 50 59 

orf 32 - 1 . pep MNTPPF- VCWIFCKVIDNFGDIGVSWRLARVLHRELGWQVHLWTDDVSALRALCPDLPDV 

Ml I I I I I I I I II I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I M LI I I I I I II I I I 
orf32ng-l MNTYAFPVCW I FCKVIDNFGD I GVSWRLARVLHRELGWQVHLWTDDVSAL RALCPDLPDV 

10 20 30 40 50 60

60 70 80 90 100 110 119

orf 3 2 - 1 . pep P CVHQD I HVRTWHSDAAD I DTAP VPD W I ET FACDL P ENVLH 1 1 RRHKPLWLNWE YLS AE 

I 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 hi I ! 1 1 1 1 M M 1 1 h 1 1 1 1 1 1 1 1 M I 1 1 M ! 

orf32ng-l PFVHQD I HVRTWHSDAAD I DTAPVPDAV I ETFACDLPENVLN 1 1 RRHKPLWLNWE YLS AE 

70 80 90 100 110 120 






120 130 140 150 160 170 179 

orf 32- 1 . pep ESNERLHLMPSPQEGVQKYFWFMGFSEKSGGLIRERDYCEAVRFDTEALRERLMLPEKNA 

1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1: 1 1 1 II 1 1 1 1 1 , 1 II 1 1 1 1 1 M M : M = Mill 

orf 3 2ng- 1 ESNERLHLMP SPQEGVQKYFWFMGFSEKSGGL I RERDYREAVRFDTEALRRRLVLPEKNA 

130 140 150 160 170 180 






180 190 200 210 220 230 239 

or f 3 2 - 1 . pep SEWLLFGYRSDVWAKWLEMWRQAGSPMTLLLAGTQI IDSLKQSGVI PQDALQNDGDVFQT 
I I I I I I I M I I i I I I :! M I i II I I I I h I I I I I I I I M I I I h I I h ' I I I I 
orf32ng-l PEWLLFGYRGDVWAKWLDMWQQAGSLMTLLLAGAQI IDSLKQSGVI PQNALQNEGGVFQT 

190 200 210 220 230 240 






240 250 260 270 280 290 299 

orf32-l .pep ASVRLVKIPFVPQQDFDQLLHLADCAVIRGEDSFVRAQLAGKPFFWHIYPQDENVHLDKL 

I I I I I I I I I I I I I I I h II I Ml I I I I I I I I I I hll I I I I I I . I I I I I I I I I I I I I 
orf32ng-l ASVRLVKIPFVPQQDFDKLLHLADCAVIRGEDSFVRTQLAGKPFFWHIYPQDENVHLDKL 

250 260 270 280 290 300 



300 310 320 330 340 350 359 

orf 3 2 - 1 . pep HAFWDKAHGFYTPETVSAHRRLSDDLNGGEALSATQRLECWQTLQQHQNGWRQGAEDWSR 
I M I i I : I ! I M M : 1 : I I I I I I I I I I I I I I I I I I II I I I I I I I I I I I I I I I I I I I I 
orf32ng-1 HAFWDKAYGFYTPETASVHRLLSDDLNGGEALSATQRLECWQTLQQHQNGWRQGAEDWSR

310 320 330 340 350 360 

360 370 380 

orf 32-1 .pep YLFGQPSAPEKLAAFVSKHQKIRX 

5 IMMIII IMIIIMIIIII 

or f 3 2 ng - 1 YLFGQPS AS EKLAAFVS KHQKI RX 

370 380 

On this basis, including the RGD sequence in the gonococcal protein, characteristic of adhesins, it
is predicted that the proteins from N. meningitidis and N. gonorrhoeae, and their epitopes, could be
useful antigens for vaccines or diagnostics, or for raising antibodies.
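
The presence of the RGD tripeptide can be checked directly against the listed sequences. The following is a minimal sketch, using only residues 151-200 of ORF32ng-1 (SEQ ID NO: 196) and of ORF32-1 (SEQ ID NO: 190) as given above; the helper name `find_motif` and its `offset` parameter are illustrative assumptions.

```python
def find_motif(protein: str, motif: str = "RGD", offset: int = 1) -> list:
    """Return the 1-based start positions of every occurrence of motif.

    offset is the sequence position of the first residue of `protein`,
    so that a fragment can be searched while reporting full-length numbering.
    """
    hits = []
    start = protein.find(motif)
    while start != -1:
        hits.append(start + offset)
        start = protein.find(motif, start + 1)
    return hits


# Residues 151-200 of the gonococcal and meningococcal proteins.
ng_row = "GLIRERDYREAVRFDTEALRRRLVLPEKNAPEWLLFGYRGDVWAKWLDMW"
mc_row = "LIRERDYCEAVRFDTEALRERLMLPEKNASEWLLFGYRSDVWAKWLEMWR"

print(find_motif(ng_row, offset=151))  # [189]
print(find_motif(mc_row, offset=151))  # []
```

In these fragments the gonococcal protein carries RGD at residues 189-191, whereas the corresponding meningococcal residues read RSD, consistent with the motif being attributed above to the gonococcal protein.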

ORF32-1 (SEQ ID NO: 190) (42kDa) was cloned in pET and pGex vectors and expressed in
E. coli, as described above. The products of protein expression and purification were analyzed by
SDS-PAGE. Figure 7A shows the results of affinity purification of the His-fusion protein, and
Figure 7B shows the results of expression of the GST-fusion in E. coli. Purified His-fusion protein
was used to immunise mice, whose sera were used for ELISA, giving a positive result. These
experiments confirm that ORF32-1 (SEQ ID NO: 190) is a surface-exposed protein, and that it is a
useful immunogen.
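
As a rough cross-check of the 42kDa figure, the mass of the unfused polypeptide can be estimated from its length. This sketch assumes the common rule-of-thumb average of about 110 Da per residue, which is an approximation and not a value taken from the specification; fusion partners (His tag, GST) are not included, and the function name is hypothetical.

```python
# Rule-of-thumb average residue mass in daltons (assumption, approximate).
AVERAGE_RESIDUE_MASS_DA = 110.0


def approx_mass_kda(n_residues: int) -> float:
    """Estimate polypeptide mass in kDa from the residue count alone."""
    return n_residues * AVERAGE_RESIDUE_MASS_DA / 1000.0


# ORF32-1 (SEQ ID NO: 190) has 382 residues.
print(round(approx_mass_kda(382)))  # 42
```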

Example 24 

The following partial DNA sequence was identified in N. meningitidis [<SEQ ID 197>] (SEQ ID
NO: 197):

1 . . TTGTTCCTGC GTGTNAAAGT GGGGCGTTTT TTCAGCAGTC CGGCGACGTG 
51 GTTTCGGGNC AAAGACCCTG TAAATCAGGC GGTGTTGCGG CTGTATNCGG 
101 ACGAGTGGCG GCA . ACTTCG GTACGTTGGA AAATAGNCGC AACGTCGCAC 
151 AGCCTGTGGC TCTGCACGCT GCTCGGAATG CTGGTGTCGG TATTGTTGCT 
201 GCTTTTGGTG CGGCAATATA CGTTCAACTG GGAAAGCACG CTGTTGAGCA
251 ATGCCGCTTC GGTACGCGCG GTGGAAATGT TGGCATGGCT GCCGTCGAAA
301 CTCGGTTTCC CTGTCCCCGA TGCGCGGTCG GTCATCGAAG GCCGTCTGAA
351 CGGCAATATT GCCGATGCGC GGGCTTGGTC GGGGCTGCTG GTCGNCAGTA
401 TCGCCTGCTA NGGCATCCTG CCGCGCCTG..



This corresponds to the amino acid sequence [<SEQ ID 198; ORF33>] (SEQ ID NO: 198;
ORF33):

1 ..LFLRVKVGRF FSSPATWFRX KDPVNQAVLR LYXDEWRXTS VRWKIXATSH
51 SLWLCTLLGM LVSVLLLLLV RQYTFNWEST LLSNAASVRA VEMLAWLPSK
101 LGFPVPDARS VIEGRLNGNI ADARAWSGLL VXSIACXGIL PRL..

Further work revealed the complete nucleotide sequence [<SEQ ID 199>] (SEQ ID NO: 199):



1 ATGTTGAATC CATCCCGAAA ACTGGTTGAG CTGGTCCGTA TTTTGGACGA 

51 AGGCGGTTTT ATTTTCAGCG GCGATCCCGT ACAGGCGACG GAGGCTTTGC 

101 GCCGCGTGGA CGGCAGTACG GAGGAAAAAA TCATCCGTCG GGCGGAGATG 

151 ATTGACAGGA ACCGTATGCT GCGGGAGACG TTGGAACGTG TGCGTGCGGG 

201 GTCGTTCTGG TTGTGGGTGG TGGCGGCGAC GTTTGCATTT TTTACCGGTT 

251 TTTCAGTCAC TTATCTTCTA ATGGACAATC AGGGTCTGAA TTTCTTTTTG 

301 GTTTTGGCGG GCGTGTTGGG CATGAATACG CTGATGCTGG CAGTATGGTT 

351 GGCAATGTTG TTCCTGCGTG TGAAAGTGGG GCGTTTTTTC AGCAGTCCGG 

401 CGACGTGGTT TCGGGGCAAA GACCCTGTAA ATCAGGCGGT GTTGCGGCTG 

451 TATGCGGACG AGTGGCGGCA ACCTTCGGTA CGTTGGAAAA TAGGCGCAAC 

501 GTCGCACAGC CTGTGGCTCT GCACGCTGCT CGGAATGCTG GTGTCGGTAT 

551 TGTTGCTGCT TTTGGTGCGG CAATATACGT TCAACTGGGA AAGCACGCTG 

601 TTGAGCAATG CCGCTTCGGT ACGCGCGGTG GAAATGTTGG CATGGCTGCC 

651 GTCGAAACTC GGTTTCCCTG TCCCCGATGC GCGGGCGGTC ATCGAAGGCC 

701 GTCTGAACGG CAATATTGCC GATGCGCGGG CTTGGTCGGG GCTGCTGGTC 

751 GGCAGTATCG CCTGCTACGG CATCCTGCCG CGCCTGCTGG CTTGGGTAGT 

801 GTGTAAAATC CTTTTGAAAA CAAGCGAAAA CGGATTGGAT TTGGAAAAGC 

851 CCTATTATCA GGCGGTCATC CGCCGCTGGC AGAACAAAAT CACCGATGCG 

901 GATACGCGTC GGGAAACCGT GTCCGCCGTT TCACCGAAAA TCATCTTGAA 

951 CGATGCGCCG AAATGGGCGG TCATGCTGGA GACCGAGTGG CAGGACGGCG 

1001 AATGGTTCGA GGGCAGGCTG GCGCAGGAAT GGCTGGATAA GGGCGTTGCC 

1051 ACCAATCGGG AACAGGTTGC CGCGCTGGAG ACAGAGCTGA AGCAGAAACC 

1101 GGCGCAACTG CTTATCGGCG TGCGCGCCCA AACTGTGCCG GACCGCGGCG 

1151 TGTTGCGGCA GATTGTCCGA CTCTCGGAAG CGGCGCAGGG CGGCGCGGTG 

1201 GTGCAGCTTT TGGCGGAACA GGGGCTTTCA GACGACCTTT CGGAAAAGCT
1251 GGAACATTGG CGTAACGCGC TGGCCGAATG CGGCGCGGCG TGGCTTGAGC

1301 CTGACAGGGC GGCGCAGGAA GGGCGTTTGA AAGACCAATA A

This corresponds to the amino acid sequence [<SEQ ID 200; ORF33-1>] (SEQ ID NO: 200;
ORF33-1):

1 MLNPSRKLVE LVRILDEGGF IFSGDPVQAT EALRRVDGST EEKIIRRAEM
51 IDRNRMLRET LERVRAGSFW LWVVAATFAF FTGFSVTYLL MDNQGLNFFL
101 VLAGVLGMNT LMLAVWLAML FLRVKVGRFF SSPATWFRGK DPVNQAVLRL
151 YADEWRQPSV RWKIGATSHS LWLCTLLGML VSVLLLLLVR QYTFNWESTL
201 LSNAASVRAV EMLAWLPSKL GFPVPDARAV IEGRLNGNIA DARAWSGLLV
251 GSIACYGILP RLLAWVVCKI LLKTSENGLD LEKPYYQAVI RRWQNKITDA
301 DTRRETVSAV SPKIILNDAP KWAVMLETEW QDGEWFEGRL AQEWLDKGVA
351 TNREQVAALE TELKQKPAQL LIGVRAQTVP DRGVLRQIVR LSEAAQGGAV
401 VQLLAEQGLS DDLSEKLEHW RNALAECGAA WLEPDRAAQE GRLKDQ*

Computer analysis of this amino acid sequence gave the following results: 



Homology with a predicted ORF from N. meningitidis (strain A) 



ORF33 (SEQ ID NO: 198) shows 90.9% identity over a 143aa overlap with an ORF (ORF33a)
(SEQ ID NO: 202) from strain A of N. meningitidis:



10 20 30 

or f 3 3 . pep LFLRVKVGRFFSSPATWFRXKDPVNQAVLR 

1 1 1 1 II 1 1 II 1 1 1 1 1 1 1 1 MINIMI 
orf33a LMDNQGLNFFLVLAGVXGMNTLMLAVWLAMLFLRVKVGRFFSSPATWFRGKDPVNQAVLR
90 100 110 120 130 140 

40 50 60 70 80 90 

or f 3 3 . pep LYXDEWRXTSVRWKIXATSHSLW LCTLLGMLVSVLLLLLVR QYTFNWESTLLSNAASVRA 

II Mill II II II I I I M I I Nil II : I ' II M II II II II I I I I I I I h :-l II 
orf33a LYADEWRXPSVRWKIGATSHSLW LCTLLGMLVSVLLLLLVR QYTFNWESTLLGDSSSVRL 
150 160 170 180 190 200 



100 110 120 130 140 

orf 33 . pep VEMLAWL P S KLGFP VPDARS V I EGRLNGN I AD ARAWS G LL VXS I ACXG I L PRL 

I I I I I I ■ hll M I I I I I hi I I M I I I M I I I I M I I I I I I I llllll 
orf 33a VEMLAWLPAKLGFPVPDARAVIEGRLNGNIADARAWSG LLVGSIACYGILPRLLA WAVCK 
210 220 230 240 250 260 

orf 33a ILXXTSENGLDLEKXXXXXXIRRWQNKITDADTRRETVSAVSPKIVLNDAPKWAVMLETE 
270 280 290 300 310 320 



The complete length ORF33a nucleotide sequence [<SEQ ID 201>] (SEQ ID NO: 201) is:



1 ATGTTGAATC CATCCCGAAA ACTGGTTGAG CTGGTCCGTA TTTTGGAAGA 

51 AGGCGGCTTT ATTTTCAGCG GCGATCCCGT GCAGGCGACG GAGGCTTTGC 

101 GCCGCGTGGA CGGCAGTACG GAGGAAAAAA TCATCCGTCG GGCGAAGATG 

151 ATCGACAGGA ACCGTATGCT GCGGGAGACG TTGGAACGTG TGCGTGCGGG 

201 GTCGTTCTGG TTGTGGGTGG CGGCGGCGAC GTTTGCGTTT NTTACCGNTT 

251 TTTCAGTTAC TTATCTTCTA ATGGACAATC AGGGTCTGAA TTTCTTTTTG 

301 GTTTTGGCGG GCGTGNTGGG CATGAATACG CTGATGCTGG CAGTATGGTT 

351 GGCAATGTTG TTCCTGCGCG TGAAAGTGGG GCGTTTTTTC AGCAGTCCGG 

401 CGACGTGGTT TCGGGGCAAA GACCCTGTCA ATCAGGCGGT GTTGCGGCTG

451 TATGCGGACG AGTGGCGGCN ACCTTCGGTA CGTTGGAAAA TAGGCGCAAC 

501 GTCGCACAGC CTGTGGCTCT GCACGCTGCT CGGAATGCTG GTGTCGGTAT 

551 TGTTGCTGCT TTTGGTGCGG CAATATACGT TCAACTGGGA AAGCACGCTG 

601 TTGGGCGATT CGTCTTCGGT ACGGCTGGTG GAAATGTTGG CATGGCTGCC 

651 TGCGAAACTG GGTTTTCCCG TGCCTGATGC GCGGGCGGTC ATCGAAGGTC 

701 GTCTGAACGG CAATATTGCC GATGCGCGGG CTTGGTCGGG GCTGCTGGTC 

751 GGCAGTATCG CCTGCTACGG CATCCTGCCG CGCCTCTTGG CTTGGGCGGT 

801 ATGCAAAATC CTTNTGNAAA CAAGCGAAAA CGGCTTGGAT TTGGAAAAGC 

851 NCNNNNNTCN NNCGNTCATC CGCCGCTGGC AGAACAAAAT CACCGATGCG 

901 GATACGCGTC GGGAAACCGT GTCCGCCGTT TCGCCGAAAA TCGTCTTGAA 

951 CGATGCGCCG AAATGGGCGG TCATGCTGGA GACCGAATGG CAGGACGGCG 

1001 AATGGTTCGA GGGCAGGCTG GCGCAGGAAT GGCTGGATAA GGGCGTTGCC 

1051 GCCAATCGGG AACAGGTTGC CGCGCTGGAG ACAGAGCTGA AGCAGAAACC 

1101 GGCGCAACTG CTTATCGGCG TGCGCGCCCA AACTGTGCCC GACCGCGGCG 

1151 TGTTGCGGCA GATCGTCCGA CTTTCGGAAG CGGCGCAGGG CGGCGCGGTG 

1201 GTGCANCTTT TGGCGGAACA GGGGCTTTCA GACGACCTTT CGGAAAAGCT 

1251 GGAACATTGG CGTAACGCGC TGACCGAATG CGGCGCGGCG TGGCTGGAAC 

1301 CCGACAGAGC GGCGCAGGAA GGCCGTCTGA AAACCAACGA CCGCACTTGA 

This encodes a protein having amino acid sequence [<SEQ ID 202>] (SEQ ID NO: 202):

1 MLNPSRKLVE LVRILEEGGF IFSGDPVQAT EALRRVDGST EEKIIRRAKM
51 IDRNRMLRET LERVRAGSFW LWVAAATFAF XTXFSVTYLL MDNQGLNFFL
101 VLAGVXGMNT LMLAVWLAML FLRVKVGRFF SSPATWFRGK DPVNQAVLRL
151 YADEWRXPSV RWKIGATSHS LWLCTLLGML VSVLLLLLVR QYTFNWESTL
201 LGDSSSVRLV EMLAWLPAKL GFPVPDARAV IEGRLNGNIA DARAWSGLLV
251 GSIACYGILP RLLAWAVCKI LXXTSENGLD LEKXXXXXXI RRWQNKITDA
301 DTRRETVSAV SPKIVLNDAP KWAVMLETEW QDGEWFEGRL AQEWLDKGVA
351 ANREQVAALE TELKQKPAQL LIGVRAQTVP DRGVLRQIVR LSEAAQGGAV
401 VXLLAEQGLS DDLSEKLEHW RNALTECGAA WLEPDRAAQE GRLKTNDRT*

ORF33a (SEQ ID NO: 202) and ORF33-1 (SEQ ID NO: 200) show 94.1% identity in 444 aa
overlap:



5 10 20 30 40 50 60 

or f 3 3a. pep MLNPSRKLVELVRILEEGGFIFSGDPVQATEALRRVDGSTEEKIIRRAKMIDRNRMLRET 

Mill MMMMMMMMMMMMMMIMMMMI MMIIMMM I 

orf 3 3 - 1 MLNPSRKLVELVRILDEGGFIFSGDPVQATEALRRVDGSTEEKI I RRAEM I DRNRMLRET 

10 20 30 40 50 60 



10 70 80 90 100 110 120 

orf33a.pep LERVRAGSFWLWVAAATFAFXTXFSVTYLLMDNQGLNFFLVLAGVXGMNTLMLAVWLAML

MINI IIIIIMIII I MIMMII MMIMIMI MMIMIMI I 

orf33-1 LERVRAGSFWLWVVAATFAFFTGFSVTYLLMDNQGLNFFLVLAGVLGMNTLMLAVWLAML

70 80 90 100 110 120 






130 140 150 160 170 180 

orf 3 3a . pep FLRVKVGRFFSSPATWFRGKDPVNQAVLRLYADEWRXPSVRWKIGATSHSLWLCTLLGML 

1 1 IIMI Mill 1 1 1 Mill I II IIIIIMIMI I II 1 1 II I M I II 1 1 1 1 II I M 

orf 33 - 1 FLRVKVGRFFSSPATWFRGKDPVNQAVLRLYADEWRQPSVRWKIGATSHSLWLCTLLGML 

130 140 150 160 170 180 






190 200 210 220 230 240 

orf 3 3a . pep VSVLLLLLVRQYTFNWESTLLGDSSSVRLVEMLAWLPAKLGFPVPDARAVIEGRLNGNIA 

Mill MIMIIIM MM-MM II MIMMIMM IIIIMIMIIII 

orf 3 3 - 1 VSVLLLLLVRQYTFNWESTLLSNAASVRAVEMLAWLPSKLGFPVPDARAVIEGRLNGNIA 

190 200 210 220 230 240 






250 260 270 280 290 300 

orf 33a . pep DARAWSGLLVGSIACYGILPRLLAWAVCKILXXTSENGLDLEKXXXXXXIRRWQNKITDA 

I 1 1 1 Mill III 1 1 1 1 1 1 MMMIMM MIMMII IIIIMIIIII 

orf 3 3 - 1 DARAWSGLLVGSIACYGILPRLLAWWCKILLKTSENGLDLEKPYYQAVIRRWQNKITDA 

250 260 270 280 290 300 






310 320 330 340 350 360 

orf 33a . pep DTRRETVSAVSPKIVLNDAPKWAVMLETEWQDGEWFEGRLAQEWLDKGVAANREQVAALE 

II IIMI I II MMM I MM lllllll MMIM I MM 1 1 III MMMMIMM 

orf 33-1 DTRRETVSAVSPKI ILNDAPKWAVMLETEWQDGEWFEGRLAQEWLDKGVATNREQVAALE 

310 320 330 340 350 360 






370 380 390 400 410 420 

orf 33a . pep TELKQKPAQLLIGVRAQTVPDRGVLRQIVRLSEAAQGGAWXLLAEQGLSDDLSEKLEHW 

1 1 II M I 1 1 M I M II II II 1 1 I M I II 1 1 M M 1 1 1 1 1 llllllllllllllllll 

orf 33 - 1 TELKQKPAQLL I GVRAQTVPDRGVLRQ I VRLSEAAQGGAWQLLAEQGLS DDLSEKLEHW 

370 380 390 400 410 420 






430 440 450 

or f 3 3 a . pep RNALTECGAAWLEPDRAAQEGRLKTNDRTX 

I I I M I I I II II M I I I I I I M 
orf 33 - 1 RNALAECGAAWLEPDRAAQEGRLKDQX 

430 440 



Homology with a predicted ORF from N. gonorrhoeae

ORF33 (SEQ ID NO: 198) shows 91.6% identity over a 143aa overlap with a predicted ORF
(ORF33.ng) (SEQ ID NO: 204) from N. gonorrhoeae:

orf33.pep LFLRVKVGRFFSSPATWFRXKDPVNQAVLR 30 

IMIIII IIIMIIIMI I llllllll 

orf33ng LMDNQGLNFFLVLAGVLGMNTLMLAVWLATLFLRVKVGRFFSSPATWFRGKGPVNQAVLR 100

orf 33 . pep LYXDEWRXTSVRWKIXATSHSLWLCTLLGMLVSVLLLLLVRQYTFNWESTLLSNAASVRA 90 

II hll I I I I I I I h I I I I I I I I I I I I II I I I I I I I I I I I I I I I I I I I I I I M I I I 
orf33ng LYADQWRQPSVRWKIGATAHSLWLCTLLGMLVSVLLLLLVRQYTFNWESTLLSNAASVRA 160 

orf 33 .pep VEMLAWLPSKLGFPVPDARSVIEGRLNGNIADARAWSGLLVXSIACXGILPRL 143 

III IIIIIIIMIII Ihlllllll I lillllllll Ihl MINI 

orf33ng VEMLAWLPSKLGFPVPDARAVIEGRLNGNIADARAWSGLLVGSIVCYGILPRLLAWWCK 220 

An ORF33ng nucleotide sequence [<SEQ ID 203>] (SEQ ID NO: 203) was predicted to encode a
protein having amino acid sequence [<SEQ ID 204>] (SEQ ID NO: 204):



1 MIDRDRMLRD TLERVRAGS F WLWVWASMM FTAGFS GTYL LMDNQGLNFF 

51 LVLAGVLGMN TLMLAVWLAT LFLRVKVGRF FSSPATWFRG KGPVNQAVLR 

101 LYADQWRQPS VRWKIGATAH SLW LCTLLGM LVSVLLLLLV RQYTFNWEST 

151 LLSNAASVRA VEMLAW LPSK LGFPVPDARA VIEGRLNGNI ADARAWSGLL 

201 VGSIVCYGIL PRLLAWWCK ILLKTSENGL DLEKTYYQAV IRRWQNKITD

251 ADTRRETVSA VSPKIVLNDA PKWALMLETE WQDGQWFEGR LAQEWLDKGV

301 AANREQVAAL ETELKQKPAQ LLIGVRAQTV PDRGVLRQIV RLSEAAQGGA
351 WQLLAEQGL SDDLSEKLEH WRNALTECGA AWLEPDRVAQ EGRLKDQ* 

Further sequence analysis revealed the following DNA sequence [<SEQ ID 205>] (SEQ ID NO:
205):



1 ATGTTGaatC CATCCCgaAA ACTGgttgag ctGgTCCgtA Ttttgaataa
51 agggggtrrr attttcagcg gcgatcctgt gcaggcgacg gaggctttgc
101 gccgcgtgga cggcAGTACG GAggAaaaaa tcttccgtcg GGCGGAGAtg
151 atcgACAGGg accgtatgtt gcgggACaCg TtggaacGTG TGCGTGCggg
201 gtcgtTctgG TTATGGGTGG TggtggCAtC gATGATGTtt aCCGCCGGAT
251 TTTCAGgcac ttatCttCTG ATGGACaatC AGGGGCtGAA TtTCTTTTTA
301 GTTTTggcgG GAGTGTtggG CATGaatacG ctgATGCTGG CAGTATGGtt
351 gGCAACGTTG TTCCTGCGCG TGAAAGTGGG ACGGTTTTTC AGCAGTCCGG
401 CGACGTGGTT TCGGGGCAAA GGCCCTGTAA ATCAGGCGGT GTTGCGGCTG
451 TATGCGGACC AGTGGCGGCA ACCTTCGGTA CGATGGAAAA TAGGCGCAAC
501 GGCGCACAGC TTGTGGCTCT GCACGCTGCT CGGAATGCTG GTGTCGGTAT
551 TGCTGCTGCT TTTGGTGCGG CAATATACGT TCAACTGGGA AAGCACGCTG
601 TTGAGCAATG CCGCTTCGGT ACGCGCGGTG GAAATGTTGG CATGGCTGCC
651 GTCGAAACTC GGTTTCCCTG TCCCCGATGC GCGGGCGGTC ATCGAAGGTC
701 GTCTGAACGG CAATATTGCC GATGCGCGGG CTTGGTCGGG GCTGCTGGTC
751 GGCAGTATCG TCTGCTACGG CATCCTGCCG CGCCTCTTGG CTTGGGTAGT
801 GTGTAAAATC CTTTTGAAAA CAAGCGAAAA CGGattgGAT TTGGAAAAAA
851 CCTATTATCA GGCGGTCATC CGCCGCTGGC AGAACAAAAT CACCGATGCG
901 GATACGCGTC GGGAAACCGT GTCCGCCGTT TCGCcgaAAA TCGTCTTGAA
951 CGATGCGCCG AAATGGGCGC TCATGCTGGA GACCGAGTGG CAGGACGGCC
1001 AATGGTTCGA GGGCAGGCTG GCGCAGGAAT GGCTGGATAA GGGCGTTGCC
1051 GCCAATCGGG AACAGGTTGC CGCGCTGGAG ACAGAGCTGA AGCAGAAACC
1101 GGCGCAACTG CTTATCGGCG TACGCGCCCA AACTGTGCCG GACCGGGGCG
1151 TGCTGCGGCA GATTGTGCGG CTTTCGGAAG CGGCGCAGGG CGGCGCGGTG
1201 GTGCAGCTTT TGGCGGAACA GGGGCTTTCA GACGACCTTT CGGAAAAGCT
1251 GGAACATTGG CGTAACGCGC TGACCGAATG CGGCGCGGCG TGGCTTGAGC
1301 CTGACAGGGT GGCGCAGGAA GGCCGTTTGA AAGACCAATA A

This encodes a protein having amino acid sequence [<SEQ ID 206; ORF33ng-1>] (SEQ ID NO:
206; ORF33ng-1):



1 MLNPSRKLVE LVRILNKGGF IFSGDPVQAT EALRRVDGST EEKIFRRAEM
51 IDRDRMLRDT LERVRAGSFW LWVWASMMF TAGFSGTYLL MDNQGLNFFL
101 VLAGVLGMNT LMLAVWLATL FLRVKVGRFF SSPATWFRGK GPVNQAVLRL
151 YADQWRQPSV RWKIGATAHS LWLCTLLGML VSVLLLLLVR QYTFNWESTL
201 LSNAASVRAV EMLAWLPSKL GFPVPDARAV IEGRLNGNIA DARAWSGLLV
251 GSIVCYGILP RLLAWWCKI LLKTSENGLD LEKTYYQAVI RRWQNKITDA
301 DTRRETVSAV SPKIVLNDAP KWALMLETEW QDGQWFEGRL AQEWLDKGVA
351 ANREQVAALE TELKQKPAQL LIGVRAQTVP DRGVLRQIVR LSEAAQGGAV
401 VQLLAEQGLS DDLSEKLEHW RNALTECGAA WLEPDRVAQE GRLKDQ*
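
The relationship between a nucleotide listing and the protein it is said to encode can be illustrated with a short translation routine. The following Python sketch is not part of the original analysis: it assumes the standard genetic code, reads in frame from the first base supplied, and uses a hypothetical helper name, translate. Codons containing ambiguous or unreadable bases are reported as X, in line with the use of X for unknown residues in the listings.

```python
from itertools import product

# Standard genetic code, codons enumerated in T, C, A, G order (assumption:
# the standard table applies; the source does not name a translation table).
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def translate(dna):
    """Translate a DNA string in frame 1; stop after the first stop codon.

    Whitespace is ignored and case is folded, so a listing row can be
    pasted in directly. Unrecognised codons (e.g. containing N) give 'X'.
    """
    seq = "".join(dna.split()).upper()
    protein = []
    for i in range(0, len(seq) - 2, 3):
        residue = CODON_TABLE.get(seq[i:i + 3], "X")
        protein.append(residue)
        if residue == "*":
            break
    return "".join(protein)


if __name__ == "__main__":
    # First thirty bases of the SEQ ID 205 listing.
    print(translate("ATGTTGaatC CATCCCgaAA ACTGgttgag"))
```

Applied to the first thirty bases of SEQ ID NO: 205, the sketch returns MLNPSRKLVE, the first ten residues listed for SEQ ID NO: 206.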



ORF33ng-1 (SEQ ID NO: 206) and ORF33-1 (SEQ ID NO: 200) show 94.6% identity in 446 aa
overlap:



10 20 30 40 50 60 

orf 3 3 - 1 . pep MLNPSRKLVELVRILDEGGFIFSGDPVQATEALRRVDGSTEEKI I RRAEM I DRNRMLRET 

hhhhhhlhhhhlilllhllhh I I I lh hhllhlllh 

orf33ng-l MLNPSRKLVELVRILNKGGFIFSGDPVQATEALRRVDGSTEEKIFRRAEMIDRDRMLRDT 

10 20 30 40 50 60 



70 80 90 100 110 120 

orf 3 3 - 1 . pep LERVRAGSFWLVmAAATFAFFTGFSVTYLLMDNQGLNFFLVIAGVLGMNTLMLAWLAML 

I I I I I M I I I I I I I M - I :||| I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 
orf 3 3ng- 1 LERVRAGSFWLWVWASMMFTAGFSGTYLLMDNQGLNFFLVLAGVLGMNTLMLAVWLATL 

70 80 90 100 110 120 



130 140 150 160 170 180 

orf 3 3 - 1 . pep FLRVKVGRFFSSPATWFRGKDPVNQAVLRLYADEWRQPSVRWKIGATSHSLWLCTLLGML 

I M 1 1 1 1 1 1 M II 1 1 1 1 M 1 1 1 1 1 ! I 1 1 hi I M I ' 1 ,1 1 1 1 1 : 1 M 1 1 1 II I II I 

orf33ng-l FLRVKVGRFFSSPATWFRGKGPVNQAVLRLYADQWRQPSVRWKIGATAHSLWLCTLLGML 

130 140 150 160 170 180 



190 200 210 220 230 240 

orf 3 3 - 1 . pep VSVLLLLLVRQYTFNWESTLLSNAASVRAVEMLAWLPSKLGFPVPDARAVIEGRLNGNIA 

IIIIIIIIIIIIIMIIIIIIIIIMIIIIIIIIMIIIMIIIIMIIIIIIIIIIIM 

orf 33ng-l VSVLLLLLVRQYTFNWESTLLSNAASVRAVEMLAWLPSKLGFPVPDARAVIEGRLNGNIA 

190 200 210 220 230 240 



250 260 270 280 290 300 

orf 33-1 .pep DARAWSGLLVGSIACYGILPRLLxAWWCKILLKTSENGLDLEKPYYQAVIRRWQNKITDA 

lllllllhhlhhlil hi hllll I llllllll 1 1 1 1 1 1 1 1 1 1 Ml I 

orf 33ng-l DARAWSGLLVGSIVCYGILPRLLAWWCKILLKTSENGLDLEKTYYQAVIRRWQNKITDA 

250 260 270 280 290 300 



310 320 330 340 350 360 

orf33-1.pep DTRRETVSAVSPKIILNDAPKWAVMLETEWQDGEWFEGRLAQEWLDKGVATNREQVAALE

Ml lllllhlllhllh I hi I llhhllllllllhhhlhlllllllll 

orf33ng-l DTRRETVSAVSPKI VLNDAPKWALMLETEWQDGQWFEGRLAQEWLDKGVAANREQVAALE 

310 320 330 340 350 360

370 380 390 400 410 420

orf 33 - 1 . pep TELKQKPAQLLIGVRAQTVPDRGVLRQIVRLSEAAQGGAWQLLAEQGLSDDLSEKLEHW 

1 1 1 1 1 1 1 1 1 1 1 ! 1 1 ! 1 1 1 1 1 1 1 1 1 i E 1 1 1 1 ! 1 1 1 1 1 1 1 1 1 1 1 1 1 M I f I 

orf 33ng-l TELKQKPAQLLIGVRAQTVPDRGVLRQIVRLSEAAQGGAWQLLAEQGLSDDLSEKLEHW 

370 380 390 400 410 420

430 440

or f 3 3 - 1 . pep RNALAECGAAWLEPDRAAQEGRLKDQX 

Illhllllllllllhllllllllll 
or f 3 3 ng - 1 RNALTECGAAWLEPDRVAQEGRLKDQX 

430 440 
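
The identity percentages quoted for these alignments can be reproduced, approximately, by counting identical residues over the aligned columns. The sketch below is an illustration only: the source does not state which alignment program or identity definition was used, so the convention adopted here (columns containing a gap are excluded from the overlap, and X is compared like any other symbol) is an assumption, and percent_identity is a hypothetical name.

```python
def percent_identity(a, b, gap="-"):
    """Return (percent identity, overlap length) for two aligned strings.

    Both strings must already be aligned and of equal length. Columns in
    which either sequence has a gap are skipped (an assumed convention).
    """
    if len(a) != len(b):
        raise ValueError("aligned sequences must have equal length")
    overlap = 0
    identical = 0
    for x, y in zip(a.upper(), b.upper()):
        if x == gap or y == gap:
            continue
        overlap += 1
        if x == y:
            identical += 1
    if overlap == 0:
        raise ValueError("no overlapping columns")
    return 100.0 * identical / overlap, overlap


if __name__ == "__main__":
    # Final block of the ORF33-1 / ORF33ng-1 alignment.
    pid, n = percent_identity("RNALAECGAAWLEPDRAAQEGRLKDQX",
                              "RNALTECGAAWLEPDRVAQEGRLKDQX")
    print(f"{pid:.1f}% identity over {n} columns")
```

For the final 27-column block shown above, 25 columns are identical, which is about 92.6% for that block alone; the 94.6% figure refers to the full 446 aa overlap.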

Based on the presence of several putative transmembrane domains in the gonococcal protein, it is 
predicted that the proteins from N. meningitidis and N. gonorrhoeae, and their epitopes, could be 
useful antigens for vaccines or diagnostics, or for raising antibodies. 

Example 25 

The following partial DNA sequence was identified in N. meningitidis [<SEQ ID 207>] (SEQ ID
NO: 207):



1 . CAGAAGAGTT TGTCGAGAAT TTCTTTATGG GGTTTGGGCG GCGTGTTTTT
51 CGGGGTGTCC GGTCTGGTAT GGTTTTCTTT GGGCGTTTCT TT.GAGTGCG
101 CCTGTTTTTC GGGTGTTTCT TTTCGGGGTT CGGGACGGGG GACGTTTGTG
151 GGCAGTACGG GGGTTTCTTT GAGTGTGTTT TCAGCTTGTG TTCC.GGCGT
201 CGTCCGGCTG CCTGTCGGTT TGAGCTGTGT CGGCAGGTTG CG..GTTTGA
251 CCCGGTTTTT CTTGGGTGCG GCAGGGGACG TCATTCTCCT GCCGCTTTCG
301 TCTGTGCCGT CCGGCTGTGC GGGTTCGGAT GAGGCGGCGT GGTGGTGTTC
351 GGGTTGGGCG GCATCTTGTT CCGACTACGC CGTTTGGCAG CCAGAATTCG
401 GTTTCGCGGG GGCTGTCGGT GTGTTGCGGT TCGGCTTGAA GGGTTTTGTC
451 GTCC..



This corresponds to the amino acid sequence [<SEQ ID 208; ORF34>] (SEQ ID NO: 208;
ORF34):



1 ..QKSLSRISLW GLGGVFFGVS GLVWFSLGVS XECACFSGVS FRGSGRGTFV 
51 GSTGVSLSVF SACVXGWRL PVGLSCVGRL XXLTRFFLGA AGDVILLPLS 
101 SVPSGCAGSD EAAWWCSGWA ASCPTTPFGS QNSVSRGLSV CCGSA*RVLS 
151 S.. 

Further work revealed the complete nucleotide sequence [<SEQ ID 209>] (SEQ ID NO: 209):



1 ATGATGATGC CGTTCATAAT GCTTCCTTGG ATTGCkGGTG TGCCTGCCGT 

51 GCCGGGTCAG AATAGGTTGT CCAGAATTTC TTTATGGGGT TTGGGCGGCG 

101 TGTTTTTCGG GGTGTCCGGT TTGGTATGGT TTTCTTTGGG CGTTTCTTTG 

151 GGCTGCGCCT GTTTTTCGGG TGTTTCTTTT CGGGGTTCGG GACGGGGGAC 

201 GTTTGTGGGC AGTACGGGGG TTTCTTTGAG TGTGTTTTCA GCTTGTGTTC 

251 CGGCGTCGTC CGGCTGCCTG TCGGTTTGAG CTGTGTCGGC AGGTTGCGGT 

301 TTGACCCGGT TTTTCTTGGG TGCGGCAGGG GACGGCAGTC CGCTGCCGCT 

351 TTCGTCTGTG CCGTCCGGCT GTGCGGGTTC GGATGAGGCG GCGTGGTGGT
401 GTTCGGGTTG GGCGGCATCT TGTCCGACTA CGCCGTTTGG CAGCCAGAAT
451 TCGGTTTCGC GGGGGCTGTC GGTGTGTTGC GGTTCGGCTT GAAGGGTTTT
501 GTCGCCGTTC GGGTTGAATG TGCTGACGAT GCCTATTGCC AATGCGCCGA
551 TGGCGGCGAT ACAGATGAGC AATACGGCGC GTATCAGGAG TTTGGGGGTC
601 AGCCTGAAGG GTTTGTTCGG TTTTTTTGCC ATTTTGATTG TGCTTTTGGG
651 GTGTCGGGCA ATGCCGTCTG AAGGCGGTTC AGACGGCATT GCCGAGTCAG
701 CGTTGGACGT AGTTTTGGTA GAGGGTGATG ACTTTTTGTA CGCCGACGGT
751 GGTGCTGACT TTTTGGGTAA TCTGCGCCTG TTCTTCGGGG GTGAGGATGC
801 CCATAACGTA GGTTACGTTG CCGTAGGTAA CGATTTTGAC GCGCGCCTGT
851 GTGGCGGGGC TGATGCCCAA CAGCGTGGCG CGGACTTTGG ATGTGTTCCA
901 AGTGTCGCCG GCGATGTCGC CGGCAGTGCG CGGCAGGGAG GCGACGGTAA
951 TATAGTTGTA CACGCCTTCG GCGGCCTGTT CGGAACGTGC AATCTGACCG
1001 ACGAACTGTT TTTCGCCTTC GGTGGCGACT TGTCCGAGCA GCAGCAGGTG
1051 GCGGTTGTAG CCGACGACGG AGATTTGGGG CGTGTAGCCT TTGGTTTGGT
1101 TGTTTTGGCG CAGATAGGAA CGGGCGGTGG TTTCGATACG CAACGCCATA
1151 ACGTTGTCGT CGGTTTGCGC GCCGGTGGTT CGGCGGTCGA CGGCGGATTT
1201 CGCGCCGACG GCGGCGCTTC CGATTACTGC GCTGACGCAG CCGCTAAGGG
1251 CAAGGCTGAA AATGGCGGCA ATCAGGGTGC GGACGGTGTG CGGTTTGGGT
1301 TTCATCGGGT GCTTCCTTTC TTGGGCGTTT CAGACGGCAT TGCTTTGCGC
1351 CATGCCGTCT GA



This corresponds to the amino acid sequence [<SEQ ID 210; ORF34-1>] (SEQ ID NO: 210;
ORF34-1):



1 MMMPFIMLPW IAGVPAVPGQ NRLSRISLWG LGGVFFGVSG LVWFSLGVSL 
51 GCACFSGVSF RGSGRGTFVG STGVSLSVFS ACVPASSGCL SV*AVSAGCG 
101 LTRFFLGAAG DGSPLPLSSV PSGCAGSDEA AWWCSGWAAS CPTTPFGSQN 
151 SVSRGLSVCC GSA*RVLSPF GLNVLTMPIA NAPMAAIQMS NTARIRSLGV 
201 SLKGLFGFFA ILIVLL GCRA MPSEGGSDGI AESALDWLV EGDDFLYADG 
251 GADFLGNLRL FFGGEDAHNV GYVAVGNDFD ARLCGGADAQ QRGADFGCVP 

301 SVAGDVAGSA RQGGDGNIW HAFGGLFGTC NLTDELFFAF GGDLSEQQQV
351 AWADDGDLG R VAFGLWLA QIGTGGGF DT QRHNWVGLR AGGSAVDGGF

401 RADGGASDYC ADAAAKGKAE NGGNQGADGV RFGFHRVLPF LGVSDGIALR
451 HAV* 

Computer analysis of this amino acid sequence gave the following results: 
Homology with a predicted ORF from N. meningitidis (strain A) 

ORF34 (SEQ ID NO: 208) shows 73.3% identity over a 161 aa overlap with an ORF (ORF34a)
(SEQ ID NO: 212) from strain A of N. meningitidis:



10 20 30 
orf 34 .pep QKS L S R I S L WGLGG VF FGVS GL VW FS LG VSXE CAC 

II III 1 1 1 1 1 1 i MINIMUM. Ill 

orf 34a MMXPXIMLPWIAGVPAVPGQKRLSRXSLWGLGGXFFGVSGLVWFSLGVSXSLGVSXGCAC 

10 20 30 40 50 60 



40 50 60 70 80 90 
orf 34 . pep FSGV S FRGSGRG TFVGSTGVSLSVFSACVX GWRLPVGLSCVGRLXX LTRFFLGA 

I I I I I I M I I I M I II I I I I I I I I I N I:: :|: = III I II 

orf 34a FSGV S FRGSGRG TFVGSTGVSLSVFSACA PAS SGCLS VXAVS AGCGLTRXFXGA 

70 80 90 100 110

100 110 120 130 140 150

orf 34 . pep AGDVILLPLSSVPSGCAGSDEAAWWCSGWAASCPTTPFGSQNSVSRGLSVCCGSAXRVLS 

Ml I I I I I I I I I I I h I I I I I I I I I I I II I I I I I I I I I I I M I I M II : I I I I 
orf 34a AGDGSPLPLSSVPSGCAGADEEAXXCSGWAASCPTTPFGSQNSVSRGLSVCCGSVWRVLS 
120 130 140 150 160 170 



orf 34. pep S 

orf 34a PFGXNVLTMPIANAPMAVIOMSNTARIRSL GVSLKGLFXFFAILIVLL GCRAMPSEGGSD 
180 190 200 210 220 230

The complete length ORF34a nucleotide sequence [<SEQ ID 211>] (SEQ ID NO: 211) is:

1 ATGATGATNC CGTTNATAAT GCTTCCTTGG ATTGCGGGTG TGCCTGCCGT
51 GCCGGGTCAG AAGAGGTTGT CGAGAANTTC TTTATGGGGT TTAGGCGGCN
101 TGTTTTTCGG GGTGTCCGGT TTGGTATGGT TTTCTTTGGG CGTTTCTNTT
151 TCTTTGGGTG TTTCTNTGGG CTGTGCCTGT TTTTCGGGTG TTTCTTTTCG
201 GGGTTCGGGA CGGGGGACGT TTGTGGGCAG TACNGGGGTT TCTTTGAGTG
251 TGTTTTCAGC TTGTGCTCCG GCGTCGTCCG GCTGCCTGTC GGTTTNAGCT
301 GTGTCGGCAG GTTGCGGTTT GACCCGGNTT TTCTTNGGTG CGGCAGGGGA
351 CGGCAGTCCG CTGCCGCTTT CGTCTGTGCC GTCCGGCTGT GCGGGTGCGG
401 ATGAGGAGGC GTNGTNGTGT TCGGGTTGGG CGGCATCTTG TCCGACTACG
451 CCGTTTGGCA GCCAGAATTC GGTTTCGCGG GGGCTGTCGG TGTGTTGCGG
501 TTCGGTNTGG AGGGTTTTGT CNCCGTTCGG GTNGAATGTG CTGACGATGC
551 CTATTGCCAA TGCGCCGATG GCGGTGATAC AGATGAGCAA TACGGCGCGT
601 ATCAGGAGTT TGGGGGTCAG CCTGAAGGGT TTGTTCNGTT TTTTTGCCAT
651 TTTGATTGTG CTTTTGGGGT GTCGGGCAAT GCCGTCTGAA GGCGGTTCAG
701 ACGGCATTGC CGAGTCAGCG TTGGACGTAG TTTNGGTAGA GGGTGATGAC
751 TTTTTGTACG CCGACGGTGG TGCTGACTTT TTGGGTAATC TGCGCCTGTT
801 CTTCGGGGGT GAGGATGCCC ATAACGTAGG TTACGTTGCC GTAGGTAACG
851 ATTTTGACGC GCGCCTGTGT GGCGGGGCTG ATGCCCAACA GCGTGGCGCG
901 GACTTTGGAT GTGTTCCAAG TGTCGCCGGC GATGTCGCCG GCAGTGCGCG
951 GCAGGGAGGC GACGGTAATG TANTTGTACA CGCCTTCGGC GGCCTGTTCG
1001 GAACGTGCAA TCTGACCGAC GAACTGTTTC TCGCCTTCGG TGGCGACTTG
1051 TCCGAGCAGC AGCAGGTGGC GGTTGTAGCC GACAACGGAG ATTTGGGGCG
1101 TGTANCCTTT GGTTTGGTTG TTTTGGCGCA GATAGGAGCG GGCGGTGGTT
1151 TCGATACGCA GCGCCATTAC GTTGTCGTCG GTTNGCGCGC CGGTGGTTCG
1201 GCGGTCGACG GCGGATTTCG CGCCGACCGC CGCGCCGCCG ACGACTGCGC
1251 TGACGCAGCC GCCGAGGGCA AGGCTGAGGA CGGCGGCAGT CAGGGTGCGG
1301 ACGGTGTGCG GTTTGGGTTT CATCGGGTGC TTCCTTTCTT GGGCGTTTCA
1351 GACGGCATTG CTTTGCGCCA TGCCGTCTGA

This encodes a protein having amino acid sequence [<SEQ ID 212>] (SEQ ID NO: 212):

1 MMXPXIMLPW IAGVPAVPGQ KRLSRXSLWG LGGXFFGVSG LVWFSLGVSX 

51 SLGVSXGCAC FSGV SFRGSG RG TFVGSTGV SLSVFSACA P ASSGCLSVXA 

101 VSAGCGLTRX FXGAAGDGSP LPLSSVPSGC AGADEEAXXC SGWAASCPTT 

151 PFGSQNSVSR GLSVCCGSVW RVLSPFGXNV LTMPIANAPM AVIQMSNTAR

201 IRSL GVSLKG LFXFFAILIV LL GCRAMPSE GGSDGIAESA LDWXVEGDD 

251 FLYADGGADF LGNLRLFFGG EDAHNVGYVA VGNDFDARLC GGADAQQRGA 

301 DFGCVPSVAG DVAGSARQGG DGNVXVHAFG GLFGTCNLTD ELFLAFGGDL 

351 SEQQQVAWA DNGDLGR VXF GLWLAQIGA GGGF DTQRHY VWGXRAGGS 

401 AVDGGFRADR RAADDCADAA AEGKAEDGGS QGADGVRFGF HRVLPFLGVS

451 DGIALRHAV*

ORF34a (SEQ ID NO: 212) and ORF34-1 (SEQ ID NO: 210) show 91.3% identity in 459 aa
overlap:

10 20 30 40 50 60

orf 34a . pep MMXPXIMLPWIAGVPAVPGQKRLSRXSLWGLGGXFFGVSGLVWFSLGVSXSLGVSXGCAC 

II I II 1 1 1 1 M M I M 1 1 M M I I III 1 1 1 1 1 i II M 1 1 1 1 II II 

orf 34 - 1 MMMPFIMLPWIAGVPAVPGQNRLSRISLWGLGGVFFGVSGLVWFSLGVSL GCAC 

10 20 30 40 50

70 80 90 100 110 120 

orf 34a . pep FSGVSFRGSGRGTFVGSTGVSLSVFSACAPASSGCLSVXAVSAGCGLTRXFXGAAGDGSP 

MM IIIIIIIIIIIIIIIIIIIIM : IMMMMIMIIIIIII I 1 1 1 1 1 1 1 

orf 34 - 1 FSGVS FRGSGRGTFVGSTGVSLSVFSACVPASSGCLSVXAVSAGCGLTRFFLGAAGDGSP 

60 70 80 90 100 110

130 140 150 160 170 180 

orf 34a . pep LPLSSVPSGCAGADEEAXXCSGWAASCPTTPFGSQNSVSRGLSVCCGSVWRVLSPFGXNV 

1111111)1111 = 11 I M II MM I M 1 1 1 II II 1 1 1 1 1 1 1 1 1 M I M 1 1 1 1 II 

orf 34 - 1 LPLSSVPSGCAGSDEAAWWCSGWAASCPTTPFGSQNSVSRGLSVCCGSAXRVLSPFGLNV 
120 130 140 150 160 170

190 200 210 220 230 240 

orf 34a . pep LTMPIANAPMAVIQMSNTARIRSLGVSLKGLFXFFAILIVLLGCRAMPSEGGSDGIAESA 

II I MMIMMMMIMM I I Ml 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 II I II I II I II 

orf 34 - 1 LTMPIANAPMAAIQMSNTARIRSLGVSLKGLFGFFAILIVLLGCRAMPSEGGSDGIAESA 
180 190 200 210 220 230

250 260 270 280 290 300 

or f 3 4 a . pep LDWXVEGDDFLYADGGADFLGNLRLFFGGEDAHNVGYVAVGNDFDARLCGGADAQQRGA 

1 1 1 1 1 1 1 1 1 1 M I II 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 M II I II 1 1 

or f 3 4 - 1 LDWLVEGDDFLYADGGADFLGNLRLFFGGEDAHNVGYVAVGNDFDARLCGGADAQQRGA 
240 250 260 270 280 290

310 320 330 340 350 360 

orf 34a . pep DFGCVPSVAGDVAGSARQGGDGNVXVHAFGGLFGTCNLTDELFLAFGGDLSEQQQVAWA 

I MIMMMIMMMIMM MIMMIMMMIIMM I llllllllll 

orf 34 - 1 DFGCVPSVAGDVAGSARQGGDGNI WHAFGGLFGTCNLTDELFFAFGGDLSEQQQVAWA 

300 310 320 330 340 350

370 380 390 400 410 420 

orf 34a . pep DNGDLGRVXFGLWLAQIGAGGGFDTQRHYVWGXRAGGSAVDGGFRADRRAADDCADAA 

MMIMI MIIIIIIIMIII Mill 1 1 1 1 IMIIIIIIIIMI hi Mill 

orf 34 - 1 DDGDLGRVAFGLWLAQIGTGGGFDTQRHNWVGLRAGGSAVDGGFRADGGASDYCADAA 
360 370 380 390 400 410

430 440 450 460 

orf 34a . pep AEGKAEDGGSQGADGVRFGFHRVLPFLGVSDGIALRHAVX 

M 1 1 MM M 1 1 1 II 1 1 1 1 1 1 1 1 1 MM I II 1 1 1 1 M M 

orf 34 - 1 AKGKAENGGNQGADGVRFGFHRVLPFLGVSDGIALRHAVX 
420 430 440 450

Homology with a predicted ORF from N. gonorrhoeae 

ORF34 (SEQ ID NO: 208) shows 77.6% identity over a 161 aa overlap with a predicted ORF
(ORF34.ng) (SEQ ID NO: 214) from N. gonorrhoeae:

orf34 pep QKSLSRISLWGLGGVFFGVSGLVWFSLGVSXE CAC 35 

|| | | || || || |:|| || || | | || | Ml | II III

orf34ng MMMPFIMLPWIAGVPAVPGQKRLSRISLWGLAGVFFGVSGLVWFSLGVSFSLGVSLGCAC 60

orf34.pep FSGVSFRGSGRGTFVGSTGVSLSVFSACVXGWRLPVGLSCV GRLXXLTRFFLGA

llllllllll MM MM MIIMI II MM I = II llllllll 

orf 34ng FSGVSFRGSGWGAFVGSTGVSLSVFSACVP VPVNESAARAASEGR - -GLTRFFLGA 



90 



114 



orf 34 . pep AGDVILLPLSSVPSGCAGSDEAAWWCSGWAASCPTTPFGSQNSVSRGLSVCCGSAXRVLS 

III M MM MMMM M MM MM MMM MMMMMMIMIM 1 1 1 1 

orf 34ng AGDGSPLPLSSVPSGCAGSDEAAWWCSGWAASCPTAPFGSQNSVSRGLSVCCGSVWRVLS 



150 



174 



orf 34 .pep 



175 



orf 34ng PFGLNVLTMPTANAPMAVIQMSNTARIRSLGVSLKGLFGFFAILIVLLGCRAMPSEGGSD 234 

The complete length ORF34ng nucleotide sequence [<SEQ ID 213>] (SEQ ID NO: 213) is:



1 ATGATGATGC CGTTCATAAT GCTTCCTTGG ATTGCGGGTG TGCCTGCCGT
51 GCCGGGTCAA AAGAGGTTGT CGAGAATCTC TTTATGGGGT TTGGCCGGCG
101 TGTTTTTCGG GGTGTCCGGT TTGGTATGGT TTTCTTTGGG CGTTTCTTTT
151 TCTTTGGGTG TTTCTTTGGG CTGCGCCTGT TTTTCGGGTG TTTCTTTTCG
201 GGGTTCGGGA TGGGGGGCGT TTGTGGGCAG TACGGGGGTT TCTTTGAGTG
251 TGTTTTCAGC TTGTGTTCCG GTGCCGGTTA ACGAATCGGC TGCCCGGGCC
301 GCATCCGAAG GGCGCGGTTT gACCCGGTTT TTCTTGGGTG CGGCAGGGGA
351 CGGCAGTCCG CTGCCGCTTT CTTCTGTGCC GTCCGGCTGT GCGGGTTCGG
401 ATGAGGCGGC GTGGTGGTGT TCGGGTTGGG CGGCATCTTG TCCGACGGCG
451 CCGTTTGGCA GCCAGAATTC GGTTTCGCGG GGGCTGTCGG TGTGTTGCGG
501 TTCGGTTTGG AGGGTTTTGT CGCCGTTCGG GTTGAATGTG CTGACGATGC
551 CTACTGCCAA TGCGCCGATG GCGGTGATAC AGATGAGCAA TACGGCGCGT
601 ATCAGGAGTT TGGGGGTCAG CCTGAAGGGT TTGTTCGGTT TTTTTGCCAT
651 TTTGATTGTG CTTTTGGGGT GTCGGGCAAT GCCGTCTGAA GGCGGTTCAG
701 ACGGCATTGC CGAGTCAGCG TTGGACGTAG TTTTGGTAGA GGGTAATGAC
751 TTTTTGTACG CCGAcggTGG TGCTGACTTT TTGGGTAATC TGCGCCTGTT
801 CTTCGGGGGT GAGGATGCCC ATAACGTAGG TTACATTGCC GTAGGTAATG
851 ATTTTGACGC GCGCCTGTGT AGCGGGGCTG ATGCCCAGCA GcgtgGCGCG
901 GACTTTGGAC GTGTTCCAAG TGTCGCCGGC GATGTCGCCC GCAGTGCGCG
951 GCAGGGAGGC GACGGTAATG TAGTTGTATA CGCCTTCGGC GGCCTGTTCG
1001 GAACGTGCAA TCTGACCGAC GAACTGTTTT TCGCCTTCGG TGGCGACTTG
1051 TCCGAGCAGC AGCAGGTGGC GGTTGTAGCC GACGACGGAG ATTTGGGGCG
1101 TGTAGCCTTT GGTTTGGTTG TTTTGGCGCA GGTAGGAACG GGCGGTGGTT
1151 TCGATACGCA ACGCCATAAC GTtgtCATCG GTTtgcgcgc CGGTGGTTcg
1201 gCGGTCGATG ACGGATTTTG CGCCGACGGC GGCCCCGCCG ACGACTGCGC
1251 TGAAGCAGCC GCCGAGGGCA AGGCTGAGGA CGGCGGCAAT CAGGGTGCGG
1301 ACGGTGTGTG GTTTGGGTTT CATCGGGGAC TTCCTTTCTT GGGCGTTTCA
1351 GACGGCATTG CTTTGCGCCA TGCCGTCTGA

This encodes a protein having amino acid sequence [<SEQ ID 214>] (SEQ ID NO: 214):

1 MMMPFIMLPW IAGVPAVPGQ KRLSRISLWG LAGVFFGVSG LVWFSLGVSF

51 SLGVSLGCAC FSGV SFRGSG WG AFVGSTGV SLSVFSACV P VPVNESAARA 

101 ASEGRGLTRF FLGAAGDGSP LPLSSVPSGC AGSDEAAWWC SGWAASCPTA 

151 PFGSQNSVSR GLSVCCGSVW RVLSPFGLNV LTMPTANAPM AVIQMSNTAR 

201 IRSLG VSLKG LFGFFAILIV LL GCRAMPSE GGSDGIAESA LDWLVEGND

251 FLYADGGADF LGNLRLFFGG EDAHNVGYIA VGNDFDARLC SGADAQQRGA 

301 DFGRVPSVAG DVARSARQGG DGNWVYAFG GLFGTCNLTD ELFFAFGGDL
351 SEQQQVAWA DDGDLGR VAF GLWLAQVGT GGGF DTQRHN WIGLRAGGS

401 AVDDGFCADG GPADDCAEAA AEGKAEDGGN QGADGVWFGF HRGLPFLGVS
451 DGIALRHAV*

ORF34ng (SEQ ID NO: 214) and ORF34-1 (SEQ ID NO: 210) show 90.0% identity in 459 aa
overlap:

10 20 30 40 50
orf 34 - 1 . pep MMMPFIMLPWIAGVPAVPGQNRLSRISLWGLGGVFFGVSGLVWFSLGVS LGCAC 

1 1 1 M 1 1 1 M 1 1 1 1 1 II 1 1 Ml 1 1 1 1 1 1 II M 1 1 1 1 1 1 M 1 1 1 M 1 1 1 Mill 

orf34ng MMMPFIMLPWIAGVPAVPGQKRLSRISLWGLAGVFFGVSGLVWFSLGVSFSLGVSLGCAC 

10 20 30 40 50 60 

60 70 80 90 100 110 

orf 34 - 1 . pep FSGVSFRGSGRGTFVGSTGVSLSVFSACVPASSGCLSVXAVSAGCGLTRFFLGAAGDGSP 

II I Mill I MM Mill MM I IM : - Ml I III I I II II II II 
orf 34ng FSGVSFRGSGWGAFVGSTGVSLSVFSACVPVPVNESAARAASEGRGLTRFFLGAAGDGSP 

70 80 90 100 110 120 

120 130 140 150 160 170 

orf34-1.pep LPLSSVPSGCAGSDEAAWWCSGWAASCPTTPFGSQNSVSRGLSVCCGSAXRVLSPFGLNV

IM I 1 1 1 1 1 1 1 1 1 1 1 II 1 1 1 1 1 1 1 M I M 1 1 1 1 1 1 II 1 1 1 1 1 1 II I M llllllllll 

orf34ng LPLSSVPSGCAGSDEAAWWCSGWAASCPTAPFGSQNSVSRGLSVCCGSVWRVLSPFGLNV 

130 140 150 160 170 180 

180 190 200 210 220 230 

orf 34 - 1 . pep LTMPIANAPMAAIQMSNTARIRSLGVSLKGLFGFFAILIVLLGCRAMPSEGGSDGIAESA 

1 1 1 1 1 1 1 1 1 M 1 1 1 1 1 1 1 1 1 1 M I 1 1 1 1 1 1 M II 1 1 M 1 1 1 1 1 1 1 1 1 1 II M II I M 

orf34ng LTMPTANAPMAVIQMSNTARIRSLGVSLKGLFGFFAILIVLLGCRAMPSEGGSDGIAESA 

190 200 210 220 230 240 

240 250 260 270 280 290 

or f 3 4 - 1 . pep LDWLVEGDDFLYADGGADFLGNLRLFFGGEDAHNVGYVAVGNDFDARLCGGADAQQRGA 

I I M 1 1 1 1 M I II 1 1 1 1 1 1 1 1 1 1 M 1 1 1 1 M 1 1 M 1 1 M 1 1 1 1 1 1 M M II II 1 1 1 1 

or f 3 4 ng LDWLVEGNDFLYADGGADFLGNLRLFFGGEDAHNVGY I AVGNDFDARLCSGADAQQRGA 

250 260 270 280 290 300 

300 310 320 330 340 350 

orf 34 - 1 . pep DFGCVPSVAGDVAGSARQGGDGNIWHAFGGLFGTCNLTDELFFAFGGDLSEQQQVAWA 

III MINIMI I I I I I I I I I : M : I I I I I 1 I I I I I I I I I I I I I 1 

orf 34ng DFGRVPSVAGDVARSARQGGDGNVWYAFGGLFGTCNLTDELFFAFGGDLSEQQQVAWA 

310 320 330 340 350 360 

360 370 380 390 400 410 

orf 34 - 1 . pep DDGDLGRVAFGLWLAQIGTGGGFDTQRHNWVGLRAGGSAVDGGFRADGGASDYCADAA 

I I I I I I II I : I I I I I I I : II I I I I I I I I II I I I I M 11 = 11 

orf 34ng DDGDLGRVAFGLWLAQVGTGGGFDTQRHNWIGLRAGGSAVDDGFCADGGPADDCAEAA 

370 380 390 400 410 420 



420 430 440 450 

or f 34 - 1 . pep AKGKAENGGNQGADGVRFGFHRVLP FLGVSDG I ALRHAVX 

i M 1 1 IM 1 1 1 1 1 1 1 Mill I II 1 1 1 II 1 1 1 1 1 1 M 

orf34ng AEGKAEDGGNQGADGVWFGFHRGLP FLGVSDG I ALRHAVX 

430 440 450 460 

Based on this analysis, including the presence of a putative leader sequence (double-underlined) 
and several putative transmembrane domains (single-underlined) in the gonococcal protein, it is 
predicted that the proteins from N. meningitidis and N. gonorrhoeae, and their epitopes, could be 
useful antigens for vaccines or diagnostics, or for raising antibodies. 
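
The source does not say how the putative transmembrane domains were predicted. One common heuristic, shown here purely as an assumed illustration, is a sliding-window average of Kyte-Doolittle hydropathy values. The window length of 19 and the cutoff of 1.6 are conventional choices for that method, not parameters taken from this document, and hydrophobic_windows is a hypothetical name.

```python
# Kyte-Doolittle hydropathy index for the twenty standard amino acids.
KD = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def hydrophobic_windows(protein, window=19, threshold=1.6):
    """Return (1-based start, mean hydropathy) for each window at or above
    the threshold. Windows containing non-standard symbols (X, *) are skipped.
    """
    seq = "".join(protein.split()).upper()
    hits = []
    for start in range(len(seq) - window + 1):
        segment = seq[start:start + window]
        if any(residue not in KD for residue in segment):
            continue
        mean = sum(KD[residue] for residue in segment) / window
        if mean >= threshold:
            hits.append((start + 1, round(mean, 2)))
    return hits


if __name__ == "__main__":
    # A 19-residue stretch taken from the ORF33 listings, used only as input.
    print(hydrophobic_windows("CTLLGMLVSVLLLLLVRQY"))
```

Runs of consecutive hits suggest a hydrophobic stretch long enough to span a membrane; any such call would still need to be checked against the underlining in the original figures.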



Example 26

The following partial DNA sequence was identified in N. meningitidis [<SEQ ID 215>] (SEQ ID
NO: 215):

1 ATGAAAACCT TCTTCAAAAC CCTTTCCGCC GCCGCACTCG CGCTCATCCT 

51 CGCCGCCTGC GGATT . CAAA AAGACAGCGC GCCCGCCGCA TCCGCTTCTG 

101 CCGCCGCCGA CAACGGCGCG GCGTAAAAAA GAAATCGTCT TCGGCACGAC 

151 CGTCGGCGAC TTCGGCGATA TGGTCAAAGA ACAAATCCAA GCCGAGCTGG 

201 AGAAAAAAGG CTACACCGTC AAACTGGTCG AGTTTACCGA CTATGTACGC 

251 CCGAATCTGG CATTGGCTGA GGGCGAGTTG 

This corresponds to the amino acid sequence [<SEQ ID 216; ORF4>] (SEQ ID NO: 216; ORF4):

1 MKTFFKTLSA AALAL I LAAC G . QKDSAPAA SASAAADNGA AKKEIVFGTT 
51 VGDFGDMVKE QIQAELEKKG YTVKLVEFTD YVRPNLALAE GEL 

Further sequence analysis revealed the complete nucleotide sequence [<SEQ ID 217>] (SEQ ID
NO: 217):



1 ATGAAAACCT TCTTCAAAAC CCTTTCCGCC GCCGCACTCG CGCTCATCCT 

51 CGCCGCCTGC GGCGGTCAAA AAGACAGCGC GCCCGCCGCA TCCGCTTCTG 

101 CCGCCGCCGA CAACGGCGCG GCGAAAAAAG AAATCGTCTT CGGCACGACC 

151 GTCGGCGACT TCGGCGATAT GGTCAAAGAA CAAATCCAAG CCGAGCTGGA 

201 GAAAAAAGGC TACACCGTCA AACTGGTCGA GTTTACCGAC TATGTACGCC 

251 CGAATCTGGC ATTGGCTGAG GGCGAGTTGG ACATCAACGT CTTCCAACAC 

301 AAACCCTATC TTGACGACTT CAAAAAAGAA CACAATCTGG ACATCACCGA 

351 AGTCTTCCAA GTGCCGACCG CGCCTTTGGG ACTGTACCCG GGCAAGCTGA

401 AATCGCTGGA AGAAGTCAAA GACGGCAGCA CCGTATCCGC GCCCAACGAC
451 CCGTCCAACT TCGCCCGCGT CTTGGTGATG CTCGACGAAC TGGGTTGGAT
501 CAAACTCAAA GACGGCATCA ATCCGTTGAC CGCATCCAAA GCGGACATCG
551 CCGAGAACCT GAAAAACATC AAAATCGTCG AGCTTGAAGC CGCGCAACTG
601 CCGCGTAGCC GCGCCGACGT GGATTTTGCC GTCGTCAACG GCAACTACGC 
651 CATAAGCAGC GGCATGAAGC TGACCGAAGC CCTGTTCCAA GAACCGAGCT 
701 TTGCCTATGT CAACTGGTCT GCCGTCAAAA CCGCCGACAA AGACAGCCAA 
751 TGGCTTAAAG ACGTAACCGA GGCCTAXAAC TCCGACGCGT TCAAAGCCTA 
801 CGCGCACAAA CGCTTCGAGG GCTACAAATC CCCTGCCGCA TGGAATGAAG 
851 GCGCAGCCAA ATAA 
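
The listings in this section number the first residue of each row and group residues in blocks of ten, fifty to a row. A minimal Python sketch of that layout follows; the function name format_listing and its parameters are illustrative assumptions, not part of the source.

```python
def format_listing(sequence, row=50, block=10):
    """Lay out a sequence as numbered rows of space-separated blocks.

    Each output line starts with the 1-based position of its first residue,
    followed by blocks of `block` residues, `row` residues per line.
    Whitespace in the input is ignored.
    """
    seq = "".join(sequence.split())
    lines = []
    for start in range(0, len(seq), row):
        chunk = seq[start:start + row]
        blocks = [chunk[i:i + block] for i in range(0, len(chunk), block)]
        lines.append(f"{start + 1} " + " ".join(blocks))
    return "\n".join(lines)


if __name__ == "__main__":
    # First sixty bases of the SEQ ID 217 listing.
    print(format_listing(
        "ATGAAAACCTTCTTCAAAACCCTTTCCGCCGCCGCACTCGCGCTCATCCTCGCCGCCTGC"
    ))
```

Re-running such a formatter over a cleaned sequence is one way to check that position labels such as 201 or 451 fall where the listing says they do.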

This corresponds to the amino acid sequence [<SEQ ID 218; ORF4-1>] (SEQ ID NO: 218;
ORF4-1):



1 MKTFFKTLSA AALAL I LAA C GGQKDSAPAA SASAAADNGA AKKEIVFGTT 

51 VGDFGDMVKE QIQAELEKKG YTVKLVEFTD YVRPNLALAE GELDINVFQH 

101 KPYLDDFKKE HNLDITEVFQ VPTAPLGLYP GKLKSLEEVK DGSTVSAPND 

151 PSNFARVLVM LDELGWIKLK DGINPLTASK ADIAENLKNI KIVELEAAQL 

201 PRSRADVDFA WNGNYAISS GMKLTEALFQ EPSFAYVNWS AVKTADKDSQ 

251 WLKDVTEAYN SDAFKAYAHK RFEGYKSPAA WNEGAAK*

Computer analysis of this amino acid sequence gave the following results: 



Homology with a predicted ORF from N. meningitidis (strain A)

ORF4 (SEQ ID NO: 216) shows 93.5% identity over a 93 aa overlap with an ORF (ORF4a) (SEQ
ID NO: 220) from strain A of N. meningitidis:

10 20 30 40 50 59 

orf 4 . pep MKTFFKTLSAAALALILAA CG-QKDSAPAASASAAADNGAAKKEIVFGTTVGDFGDMVKE 

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I II I I I I I I I I I I I II I I I I I I I I I I I I I 
orf4a MKTFFKTLSAAALALILAACGGQKDSAPAASASAAADNGAAXKEIVFGTTVGDFGDMVKE

10 20 30 40 50 60 

60 70 80 90 

orf 4 . pep QIQAELEKKGYTVKLVEFTDYVRPNLALAEGEL 

II lllllllllllll Mill lllllllll 
orf 4a X I QPELEKKG YTVKLVEXTD YVRXNLALAEGELD I NVXQHXX YLDDXKKXHNLD I TXVXQ 

70 80 90 100 110 120 

orf 4a VPTAPLGLYPGKLKSLXXVKXGSTVSAPNDPXXFXRVLVMLDELGXIKLKDXIXXXXXXX 

130 140 150 160 170 180 

The complete length ORF4a nucleotide sequence [<SEQ ID 219>] (SEQ ID NO: 219) is:



1 ATGAAAACCT TCTTCAAAAC CCTTTCCGCC GCCGCACTCG CGCTCATCCT 

51 CGCCGCCTGC GGCGGTCAAA AAGATAGCGC GCCCGCCGCA TCCGCTTCTG 

101 CCGCCGCCGA CAACGGCGCG GCGAANAAAG AAATCGTCTT CGGCACGACC 

151 GTCGGCGACT TCGGCGATAT GGTCAAAGAA CANATCCAAC CCGAGCTGGA 

201 GAAAAAAGGC TACACCGTCA AACTGGTCGA GTNTACCGAC TATGTGCGCN 

251 CGAATCTGGC ATTGGCTGAG GGCGAGTTGG ACATCAACGT CTTNCAACAC
301 ANACNCTATC TTGACGACTN CAAAAAANAA CACAATCTGG ACATCACCNN

351 AGTCTTNCAA GTGCCGACCG CGCCTTTGGG ACTGTACCCG GGCAAGCTGA

401 AATCGCTGGA NNAAGTCAAA GANGGCAGCA CCGTATCCGC GCCCAACGAC
451 CCGTNNNACT TCGNCCGCGT CTTGGTGATG CTCGACGAAC TGGGTTNGAT 
501 CAAACTCAAA GACNGCATCA NNNNGNNGNN NNNANCNANA NNNGANANNN 
551 NNNNANNNNT NNNNNNNNNN NNNNNCNNCG NNNNNNNANN NNNNNNNNNN 
601 NCGNNTNNNN NNGCNNNNNT NNANNNTNNN NNCNNCNNNN NNNNNTNNNN 
651 NANNANNAGC GGCATGAAGC TGACCGAAGC CCTGTTCCAA GAACCGAGCT 
701 TTGCCTATGT CAACTGGTCT GCCGTCAAAA CCGCCGACAA AGACAGCCAA 
751 TGGCTTAAAG ACGTAACCGA GGCCTATAAC TCCGACGCGT TCAAAGCCTA 
801 CGCGCACAAA CGCTTCGAGG GCTACAAATC CCCTGCCGCA TGGAATGAAG 
851 GCGCAGCCAA ATAA 

This is predicted to encode a protein having amino acid sequence [<SEQ ID 220>] (SEQ ID NO:
220):



1 MKTFFKTLSA AALALILAA C GGQKDSAPAA SASAAADNGA AXKEIVFGTT 

51 VGDFGDMVKE XI QPELEKKG YTVKLVEXTD YVRXNLALAE GELDINVXQH 

101 XXYLDDXKKX HNLDITXVXQ VPTAPLGLYP GKLKSLXXVK XGSTVSAPND 

151 PXXFXRVLVM LDELGXIKLK DXIXXXXXXX XXXXXXXXXX XXXXXXXXXX 

201 XXXXAXXXXX XXXXXXXXXS GMKLTEALFQ EPSFAYVNWS AVKTADKDSQ 

251 WLKDVTEAYN SDAFKAYAHK RFEGYKSPAA WNEGAAK* 



A leader peptide is underlined.

Further analysis of these strain A sequences revealed the complete DNA sequence [<SEQ ID 221>]
(SEQ ID NO: 221):



1 ATGAAAACCT TCTTCAAAAC CCTTTCCGCC GCCGCACTCG CGCTCATCCT 

51 CGCCGCCTGC GGCGGTCAAA AAGATAGCGC GCCCGCCGCA TCCGCTTCTG 

101 CCGCCGCCGA CAACGGCGCG GCGAAAAAAG AAATCGTCTT CGGCACGACC 

151 GTCGGCGACT TCGGCGATAT GGTCAAAGAA CAAATCCAAC CCGAGCTGGA 

201 GAAAAAAGGC TACACCGTCA AACTGGTCGA GTTTACCGAC TATGTGCGCC 

251 CGAATCTGGC ATTGGCTGAG GGCGAGTTGG ACATCAACGT CTTCCAACAC 

301 AAACCCTATC TTGACGACTT CAAAAAAGAA CACAATCTGG ACATCACCGA
351 AGTCTTCCAA GTGCCGACCG CGCCTTTGGG ACTGTACCCG GGCAAGCTGA

401 AATCGCTGGA AGAAGTCAAA GACGGCAGCA CCGTATCCGC GCCCAACGAC
451 CCGTCCAACT TCGCCCGCGT CTTGGTGATG CTCGACGAAC TGGGTTGGAT
501 CAAACTCAAA GACGGCATCA ATCCGCTGAC CGCATCCAAA GCGGACATTG 
551 CCGAAAACCT GAAAAACATC AAAATCGTCG AGCTTGAAGC CGCGCAACTG 
601 CCGCGTAGCC GCGCCGACGT GGATTTTGCC GTCGTCAACG GCAACTACGC 
651 CATAAGCAGC GGCATGAAGC TGACCGAAGC CCTGTTCCAA GAACCGAGCT 
701 TTGCCTATGT CAACTGGTCT GCCGTCAAAA CCGCCGACAA AGACAGCCAA 
751 TGGCTTAAAG ACGTAACCGA GGCCTATAAC TCCGACGCGT TCAAAGCCTA 
801 CGCGCACAAA CGCTTCGAGG GCTACAAATC CCCTGCCGCA TGGAATGAAG 
851 GCGCAGCCAA ATAA 

This encodes a protein having amino acid sequence [<SEQ ID 222; ORF4a-1>] (SEQ ID NO: 222;
ORF4a-1):

1 MKTFFKTLSA AALAL I LAAC GGQKDSAPAA SASAAADNGA AKKEIVFGTT 

51 VGDFGDMVKE QIQPELEKKG YTVKLVEFTD YVRPNLALAE GELDINVFQH 

101 KPYLDDFKKE HNLD I TEVFQ VPTAPLGLYP GKLKSLEEVK DGSTVSAPND 

151 PSNFARVLVM LDELGWIKLK DGINPLTASK AD I AENLKNI KIVELEAAQL 

201 PRSRADVDFA WNGNYAISS GMKLTEALFQ EPSFAYVNWS AVKTADKDSQ 

251 WLKDVTEAYN SDAFKAYAHK RFEGYKSPAA WNEGAAK* 

ORF4a-1 (SEQ ID NO: 222) and ORF4-1 (SEQ ID NO: 218) show 99.7% identity in 287 aa
overlap:

10 20 30 40 50 60 

orf4a-l MKTFFKTLSAAALAL I LAACGGQKDSAPAAS AS AAADNGAAKKE I VFGTT VGDFGDMVKE 

IMIIII IMMIIIIIIIMIIIMIII IIIIIIMIIMIIIIIIIIII Mill 

orf 4 - 1 MKTFFKTLSAAALAL I LAACGGQKDSAPAAS AS AAADNGAAKKE I VFGTTVGDFGDMVKE 

10 20 30 40 50 60 

70 80 90 100 110 120 

orf 4a- 1 Q I QPELEKKGYTVKLVEFTD YVRPNLALAEGELD INVFQHKP YLDDFKKEHNLD I TEVFQ 

III I M M I 1 1 M II 1 1 1 1 1 1 1 1 1 1 : 1 1 1 I M 1 1 1 1 1 1 1 1 1 1 1 M 1 1 II M 1 1 1 1 

orf 4 - 1 Q I QAELEKKGYTVKLVEFTD YVRPNLALAEGELD I NVFQHKP YLDDFKKEHNLD I TEVFQ 

70 80 90 100 110 120 

130 140 150 160 170 180 

orf 4a- 1 VPTAPLGLYPGKLKSLEEVKDGSTVSAPNDPSNFARVLVMLDELGWIKLKDGINPLTASK 

1 1 M I; 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 II 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 II 1 1 1 1 1 1 1 1 1 1 1 M 

orf 4 - 1 VPTAPLGLYPGKLKSLEEVKDGSTVSAPNDPSNFARVLVMLDELGWIKLKDGINPLTASK 

130 140 150 160 170 180 



190 200 210 220 230 240

orf4a-1 ADIAENLKNIKIVELEAAQLPRSRADVDFAVVNGNYAISSGMKLTEALFQEPSFAYVNWS

I I I I I I I I I I I I I I I I I I I I I I I I I I I I M I I I I I I I I I I II II I I i I I I I II I M
orf4-1 ADIAENLKNIKIVELEAAQLPRSRADVDFAWNGNYAISSGMKLTEALFQEPSFAYVNWS
190 200 210 220 230 240

250 260 270 280 

orf 4a- 1 AVKTADKDSQWLKDVTEAYNSDAFKAYAHKRFEGYKSPAAWNEGAAKX 

1 1 1 1 1 1 II 1 1 1 i 1 1 II 1 1 1 1 1 1 1 1 1 1 1 1 1 II II 1 1 1 1 Ml I M I i M 

orf4-1 AVKTADKDSQWLKDVTEAYNSDAFKAYAHKRFEGYKSPAAWNEGAAKX

250 260 270 280 
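
A figure of 99.7% identity in a 287 aa overlap corresponds to a single differing position (286/287 ≈ 99.65%, which rounds to 99.7%). The sketch below, an illustration with hypothetical names, lists the positions at which two gap-free aligned segments differ; the offset argument is an assumption used to report positions in the numbering of the alignment rulers.

```python
def substitutions(a, b, offset=1):
    """List (position, residue_in_a, residue_in_b) for each mismatch.

    The two segments must be the same length and contain no gaps;
    `offset` is the position number of the first residue supplied.
    """
    if len(a) != len(b):
        raise ValueError("segments must have equal length")
    return [
        (offset + i, x, y)
        for i, (x, y) in enumerate(zip(a, b))
        if x != y
    ]


if __name__ == "__main__":
    # Residues 61-70 of ORF4a-1 and ORF4-1 as shown in the alignment.
    print(substitutions("QIQPELEKKG", "QIQAELEKKG", offset=61))
    print(round(100 * 286 / 287, 1))
```

For residues 61 to 70 of the two sequences, the only difference reported is proline in ORF4a-1 against alanine in ORF4-1 at position 64.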

Homology with an outer membrane protein of Pasteurella haemolytica (accession q08869) (SEQ ID
NO: 1126).

ORF4 (SEQ ID NO: 216) and this outer membrane protein (SEQ ID NO: 1126) show 33% aa
identity in 91 aa overlap:

10 20 

lip2 . pasha MNFKKLLGVALVSALALTACKDEKAQAP 

II I ::M II hll :h I 
ORF4 VXTPNPDGRTPCPSFLFETATTSGENMKTFFKTLSAAAL- -ALILAACGFKKTARPPHPL - 

110 120 130 140 150

30 40 50 60 70 80 

1 ip2 . pasha - ATTAKTENKAPLKVGVMTGPEAQMTEVAVKIAKEKYGLDVELVQFTEYTQPNAALHSKD 

: :: I : h :| ::|:: - II I hlhlhhHI II = 
ORF4 LPPPTTARRKKEIVFGTTVGDFGDMVKEQIQAELEKKGYTVKLVEFTDYVRPNLALAEGE 

160 170 180 190 200 210

90 100 110 120 130 140 

lip2 . pasha LDANAFQTVPYLEQEVKDRGYKLAIIGNTLVWPIAAYSKKIKNISELKDGATVAIPNNAS 

I 

ORF4 L

Homology with a predicted ORF from N. gonorrhoeae

ORF4 (SEQ ID NO: 216) shows 93.6% identity over a 94 aa overlap with a predicted ORF
(ORF4.ng) (SEQ ID NO: 224) from N. gonorrhoeae:

10 20 30 

orf4nm.pep MKTFFKTLSAAALALILAACGXQKDSAPAA 

| | | | | | | | | : | : | | | | | | | | | | | | | | | | |

orf 4ng RANAVXTPNPDGRTPCLSFLFETATTSGENMKTFFKTLSTASLALILAACGGQKDSAPAA 

200 210 220 230 240 250 

40 50 60 70 80 89 

orf 4nm . pep S AS A - AADNGAAKKE I VFGTTVGDFGDMVKEQ I QAELEKKGYTVKLVEFTDYVRPNLALA 
35 | | : | : | | | | | | | | | | | | | | | | | | || | | | | || | | | | | | | | | | | | | | | | | | || | I I I I I II 

orf4ng SAAAPSADNGAAKKEIVFGTTVGDFGDMVKEQIQAELEKKGYTVKLVEFTDYVRPNLALA 
260 270 280 290 300 310

90
orf4nm.pep EGEL
||||
orf4ng EGELDINVFQHKPYLDDFKKEHNLDITEAFQVPTAPLGLYPGKLKSLEEVKDGSTVSAPN
320 330 340 350 360 370

The complete length ORF4ng nucleotide sequence [<SEQ ID 223>] (SEQ ID NO: 223) was
predicted to encode a protein having amino acid sequence [<SEQ ID 224>] (SEQ ID NO: 224):



1 MKTFFKTLST ASLAL ILAAC GGQKDSAPAA SAAAPSADNG AAKKEIVFGT 

51 TVGDFGDMVK EQIQAELEKK GYTVKLVEFT DYVRPNLALA EGELDINVFQ 

101 HKPYLDDFKK EHNLDITEAF QVPTAPLGLY PGKLKSLEEV KDGSTVSAPN 

151 DPSNFARALV MLNELGWIKL KDGINPLTAS KADIAENLKN IKIVELEAAQ

201 LPRSRADVDF AVVNGNYAIS SGMKLTEALF QEPSFAYVNW SAVKTADKDS

251 QWLKDVTEAY NSDAFKAYAH KRFEGYKYPA AWNEGAAK* 

Further analysis revealed the complete length ORF4ng DNA sequence [<SEQ ID 225>] (SEQ ID
NO: 225) to be:



1 atgAAAACCT TCTTCAAAAC cctttccgcc gccgcaCTCG CGCTCATCCT 

51 CGCAGCCTGc ggCggtcaAA AAGACAGCGC GCCCgcagcc tctgcCGCCG 

101 CCCCTTCTGC CGATAACGgc gCgGCGAAAA AAGAAAtcgt ctTCGGCACG 

151 Accgtgggcg acttcggcgA TAtggTCAAA GAACAAATCC AagcCGAgct 

201 gGAGAAAAAA GgctACACcg tcAAattggt cgaatttacc gactatgtGC 

251 gCCCGAATCT GGCATTGGCG GAGGGCGAGT TGGACATCAA CGTCTTCCAA 

301 CACAAACCCT ATCTTGACGA TTTCAAAAAA GAACACAACC TGGACATCAC
351 CGAAGCCTTC CAAGTGCCGA CCGCGCCTTT GGGACTGTAT CCGGGCAAAC

401 TGAAATCGCT GGAAGAAGTC AAAGACGGCA GCACCGTATC CGCGCCCAac
451 gACccgTCCA ACTTCGCACG CGCCTTGGTG ATGCTGAACG AACTGGGTTG
501 GATCAAACTC AAAGACGGCA TCAATCCGCT GACCGCATCC AAAGCCGACA 
551 TCGCGGAAAA CCTGAAAAAC ATCAAAATCG TCGAGCTTGA AGCCGCACAA 
601 CTGCCGCGCA GCCGCGCCGA CGTGGATTTT GCCGTCGTCA ACGGCAACTA 
651 CGCCATAAGC AGCGGCATGA AGCTGACCGA AGCCCTGTTC CAAGAGCCGA 
701 GCTTTGCCTA TGTCAACTGG TCTGCCgtcA AAACCGCCGA CAAAGACAGC 
751 CAATGGCTTA AAGACGTAAC CGAGGCCTAT AACTCCGACG CGTTCAAAGC 
801 CTACGCGCAC AAACGCTTCG AGGGCTACAA ATACCCTGCC GCATGGAATG 
851 AAGGCGCAGC CAAATAA 

This encodes a protein having amino acid sequence [<SEQ ID 226; ORF4ng-1>] (SEQ ID NO:
226; ORF4ng-1):

1 MKTFFKTLSA AALALILAAC GGQKDSAPAA SAAAPSADNG AAKKEIVFGT

51 TVGDFGDMVK EQIQAELEKK GYTVKLVEFT DYVRPNLALA EGELDINVFQ 

101 HKPYLDDFKK EHNLDITEAF QVPTAPLGLY PGKLKSLEEV KDGSTVSAPN 

151 DPSNFARALV MLNELGWIKL KDGINPLTAS KADIAENLKN IKIVELEAAQ 

201 LPRSRADVDF AVVNGNYAIS SGMKLTEALF QEPSFAYVNW SAVKTADKDS

251 QWLKDVTEAY NSDAFKAYAH KRFEGYKYPA AWNEGAAK* 
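
The relationship between a DNA listing such as SEQ ID NO: 225 and the encoded protein (SEQ ID NO: 226) can be checked with a short translation routine. The following is a minimal sketch, assuming the standard genetic code and a reading frame that starts at the first base; the function name translate is illustrative and is not taken from the specification. Mixed upper and lower case in the listing is treated as equivalent.

```python
from itertools import product

# Standard genetic code, codons enumerated in T, C, A, G order
# (assumption: no alternative start or stop codon handling is needed here).
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(dna: str) -> str:
    """Translate a DNA string in frame 1; unknown codons become 'X'."""
    # Drop the spacing used in the listing and normalise case.
    dna = "".join(dna.split()).upper()
    residues = []
    for i in range(0, len(dna) - 2, 3):
        residues.append(CODON_TABLE.get(dna[i:i + 3], "X"))
    return "".join(residues)


# First line of the ORF4ng DNA listing (positions 1-50) as printed above.
first_line = "atgAAAACCT TCTTCAAAAC cctttccgcc gccgcaCTCG CGCTCATCCT"
print(translate(first_line))
```

Translating the first 50 bases in this way gives the opening residues of the printed protein sequence, which is a quick way to spot character-recognition errors in either listing.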

This shows 97.6% identity in 288 aa overlap with ORF4-1 (SEQ ID NO: 218):



10 20 30 40 50 59 

orf 4 - 1 . pep MKTFFKTLS AAALAL I LAACGGQKDS APAAS AS A - AADNGAAKKE I VFGT TVGDFGDMVK 

II I II II III II II III II II I II II II I II hi :|llllnllMII IMIIMI 



orf4ng-1 MKTFFKTLSAAALALILAACGGQKDSAPAASAAAPSADNGAAKKEIVFGTTVGDFGDMVK
10 20 30 40 50 60 



60 70 80 90 100 110 119 

or f 4 - 1 . pep EQ I QAELEKKGYTVKLVE FTDYTOPNLALAEGELD INVFQHKP YLDDFKKEHNLD ITEVF 

M 1 I I I I I I I M I I II I I M i I I ; I i I 11 : I : M I M I M I i 1 : 1 

orf4ng-l EQIQAELEKKGYTVKLVEFTDYVRPNLALAEGELDINVFQHKPYLDDFKKEHNLDITEAF 

70 80 90 100 110 120 

120 13 0 14 0 150 160 170 17 9 

orf 4 - 1 . pep QVPTAPLGLYPGKLKSLEEVKDGSTVSAPNDPSNFARVLVMLDELGWIKLKDGINPLTAS 

IIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIMIhlllhlMIIIMIIIIIIIII 

orf4ng-l QVPTAPLGLYPGKLKSLEEVKDGSTVSAPNDPSNFARALVMLNELGWIKLKDGINPLTAS 

130 140 150 160 170 180 

180 190 200 210 220 230 239 

orf 4 - 1 . pep KADIAENLKNIKIVELEAAQLPRSRADVDFAWNGNYAISSGMKLTEALFQEPSFAYVNW 

IIIIIIIMIIIIII I IIMIIMII hlllllllllllllllllllMIMIIII 

orf4ng-l KADIAENLKNIKIVELEAAQLPRSRADVDFAWNGNYAISSGMKLTEALFQEPSFAYVNW 

190 200 210 220 230 240 

240 250 260 270 280 

or f 4 - 1 . pep SAVKTADKDSQWLKDVTEAYNSDAFKAYAHKRFEGYKSPAAWNEGAAKX 

llllllllll I MM M.I I IIMIIII IIIMIMI 

orf 4ng-l SAVKTADKDSQWLKDVTEAYNSDAFKAYAHKRFEGYKYPAAWNEGAAKX 

250 260 270 280 

In addition, ORF4ng-1 (SEQ ID NO: 226) shows significant homology with an outer membrane
protein (SEQ ID NO: 1126) from the database:



ID LIP2_PASHA STANDARD; PRT; 276 AA.
AC Q08869;
DT 01-NOV-1995 (REL. 32, CREATED)
DT 01-NOV-1995 (REL. 32, LAST SEQUENCE UPDATE)
DT 01-NOV-1995 (REL. 32, LAST ANNOTATION UPDATE)
DE 28.2 KD OUTER MEMBRANE PROTEIN PRECURSOR . . . ;
SCORES Init1: 279 Initn: 416 Opt: 494

Smith-Waterman score: 494; 36.0% identity in 275 aa overlap



10 20 30 ' 40 50 

orf 4ng- 1 . pep MKTFFKTLSAAAL- - ALILAACGGQKDSAPAASAAAPSADNGAAKKEIVFGTTVGDFGDM 
II I s:|| II hll =| :|||::| :::| | | |: :| ::| 

lip2_pasha MNFKKLLGVALVSALALTACKDEKAQAPATTA KTENKAPLK VGVMTGPEAQM 

10 20 30 40 50 



60 70 80 90 100 110 

orf 4ng- 1 . pep VKEQ I QAELEKKGYTVKLVE FTDYVRPNLALAEGELD I NVFQHKP YLDDFKKEHNLD I TE 

:: - II I hlhlhl-ll II HI hll 111 = = h = = 
lip2_pasha TEVAVKIAKEKYGLDVELVQFTEYTQPNAALHSKDLDANAFQTVPYLEQEVKDRGYKLAI 

60 70 80 90 100 110 



120 130 140 150 160 170 

orf 4ng- 1 . pep AFQVPTAPLGLYPGKLKSLEEVKDGSTVSAPNDPSNFARALVMLNELGWIKLKDGINPLT 

:: : h= I hi- hllhlh Ih II llll-h I -MM I : 
lip2_pasha IGNTLVWPIAAYSKKIKNISELKDGATVAIPNNASNTARALLLLQAHGLLKLKDPKN-VF 
120 130 140 150 160 170 
180 190 200 210 220 230

orf 4ng- 1 . pep ASKADI AENLKNI KI VELEAAQLPRSRADVDFAWNGNYAI SSGMKLTE - -ALFQEPSFA 

|:: II II Mlllh ::: I I ||::|h|::|| =:|:= = = = : 
lip2_pasha ATENDIIENPKNIKIVQADTSLLTRMLDDVELAVINNTYAGQAGLSPDKDGIIVESKDSP
180 190 200 210 220 230

240 250 260 270 280 289 

orf 4ng- 1 . pep YVNWSAVKTADKDSQWLKDVTEAYNSDAFKAYAHKRFEGYKYPAAWNEGAAKX 

III = : =lh h : = : = ::= I I hi 

lip2_pasha . YVNLWSREDNKDDPRLQTFVKSFQTEEVFQEALKLFNGGWKGW

240 250 260 270

Based on this analysis, including the homology with the outer membrane protein of Pasteurella
haemolytica, and on the presence of a putative prokaryotic membrane lipoprotein lipid attachment
site in the gonococcal protein, it was predicted that these proteins from N. meningitidis and
N. gonorrhoeae, and their epitopes, could be useful antigens for vaccines or diagnostics, or for
raising antibodies.
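
The specification does not say which tool or pattern was used to detect the putative lipid attachment site. As a hedged illustration only, the sketch below uses a regular expression written from the commonly cited PROSITE-style lipoprotein pattern ({DERK}(6)-[LIVMFWSTAG](2)-[LIVMFYSTAGCQ]-[AGS]-C, with the cysteine expected near the N-terminus). The pattern as transcribed here, the position cut-off of 35 and the function name find_lipobox are all assumptions and should be checked against the current PROSITE entry before any real use.

```python
import re

# Assumed transcription of the PROSITE-style lipoprotein attachment pattern.
LIPOBOX = re.compile(r"[^DERK]{6}[LIVMFWSTAG]{2}[LIVMFYSTAGCQ][AGS]C")


def find_lipobox(protein, max_cys_pos=35):
    """Return the 1-based position of the candidate lipidated Cys, or None.

    Only the first max_cys_pos residues are searched (assumed cut-off).
    """
    match = LIPOBOX.search(protein[:max_cys_pos])
    # match.end() is one past the 0-based Cys index, i.e. its 1-based position.
    return match.end() if match else None


# N-terminal residues of ORF4ng-1 (SEQ ID NO: 226) as listed above.
orf4ng_1_nterm = "MKTFFKTLSAAALALILAACGGQKDSAPAASAAAPSADNG"
print(find_lipobox(orf4ng_1_nterm))
```

Under these assumptions the candidate cysteine in ORF4ng-1 is the one in the LAAC segment of the signal peptide region.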

ORF4-1 (SEQ ID NO: 218) (30kDa) was cloned in pET and pGEX vectors and expressed in E. coli,
as described above. The products of protein expression and purification were analyzed by
SDS-PAGE. Figures 8A and 8B show, respectively, the results of affinity purification of the His-fusion
and GST-fusion proteins. Purified His-fusion protein was used to immunise mice, whose sera were
used for ELISA (positive result), Western blot (Figure 8C), FACS analysis (Figure 8D), and a
bactericidal assay (Figure 8E). These experiments confirm that ORF4-1 (SEQ ID NO: 218) is a
surface-exposed protein, and that it is a useful immunogen.

Figure 8F shows plots of hydrophilicity, antigenic index, and AMPHI regions for ORF4-1 (SEQ ID
NO: 218).
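
The specification does not state which scale or window was used for the hydrophilicity plot. As a rough illustration of how such a profile is computed, the sketch below averages Kyte-Doolittle hydropathy values over a sliding window; the choice of this scale, the window length of 7 and the function name windowed_hydropathy are assumptions. Kyte-Doolittle values are hydropathy values, so low (negative) averages correspond to hydrophilic stretches.

```python
# Kyte-Doolittle hydropathy values (used here as a stand-in scale).
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def windowed_hydropathy(protein, window=7):
    """Mean hydropathy for every full window along the sequence.

    Residues missing from the scale (for example X) count as 0.0,
    which is an arbitrary choice made for this sketch.
    """
    values = [KYTE_DOOLITTLE.get(res, 0.0) for res in protein.upper()]
    return [
        sum(values[i:i + window]) / window
        for i in range(len(values) - window + 1)
    ]


# First 20 residues of ORF4ng-1 as listed above.
profile = windowed_hydropathy("MKTFFKTLSAAALALILAAC")
print(len(profile), max(profile))
```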


Example 27 

The following partial DNA sequence was identified in N. meningitidis [<SEQ ID 227>] (SEQ ID
NO: 227):

1 CCTCGTCGTC CTCGGCATGC TCCAGTTTCA AGGGGCGATT TACTCCAAGG 
51 CGGTGGAACG TATGCTCGGC ACGGTCATCG GGCTGGGCGC GGGTTTGGGC

101 GTTTTATGGC TGAACCAGCA TTATTTCCAC GGCAACCTCC TCTTCTACCT 
151 CACCGTCGGC ACGGCAAGCG CACTGGCCGG CTGGGCGGCG GTCGGCAAAA 
201 ACGGCTACGT CCCTmTGCTG GCAGGGCTGA CGATGTGTAT GCTCATCGGC 
251 GACAACGGCA GCGAATGGCT CGACAGCGGA CTCATGCGCG CCATGAACGT

301 CCTCATCGGC GyGGCCATCG CCATCGCCGC CGCCAAACTG CTGCCGCTGA 

351 AATCCACACT GATGTGGCGT TTCATGCTTG CCGACAACCT GGCCGACTGC 

401 AGCAAAATGA TTGCCGAAAT CAGCAACGGC AGGCGCATGA CCCGCGAACG 

451 CCTCGAGGAG AACATGGCGA AAATGCGCCA AATCAACGCA CGCATGGTCA 

501 AAAGCCGCAG CCATCTCGCC GCCACATCGG GCGAAAGCTG CATCAGCCCC 

551 GCCATGATGG AAGCCATGCA GCACGCCCAC CGTAAAATCG TCAACACCAC 

601 CGAGCTGCTC CTGACCACCG CCGCCAAGCT GCAATCTCCC AAACTCAACG 

651 GCAGCGAAAT CCGGCTGCTT GACCGCCACT TCACACTGCT CCAAAC . . . . 

701 GC AGACACGCCC GCCGCATCCG 

751 CATCGACACC GCCATCAACC CCGAACTGGA AGCCCTCGCC GAACACCTCC 

801 ACTACCAATG GCAGGGCTTC CTCTGGCTCA GCACCGATAT GCGTCAGGAA 

851 ATTTCCGCCC TCGTCATCCT GCTGCAACGC ACCCGCCGCA AATGGCTGGA 

901 TGCCCACGAA CGCCAACACC TGCGCCAAAG CCTGCTTGA 

This corresponds to the amino acid sequence [<SEQ ID 228; ORF8>] (SEQ ID NO: 228; ORF8):

1 PRRP RHAPVSRGDL LQGGGTYARH GHRAGRGFGR FMAEPALFPR 

51 QPPLLPHRRH GKRTGRLGGG RQKRLRPXAG RADDVYAHRR QRQRMARQRT 

101 HARHERPHRR GHRHRRRQTA AAEIHTDVAF HACRQPGRLQ QNDCRNQQRQ 

151 AHDPRTPRGE HGENAPNQRT HGQKPQPSRR HIGRKLHQPR HDGSHAARPP 

201 XNRQHHRAAP DHRRQAAISQ TQRQRNPAAX PPLHTAPN Q 

251 TRPPHPHRHR HQPRTGSPRR TPPLPMAGLP LAQHRYASGN FRPRHPAATH 

301 PPQMAGCPRT PTPAPKPA*

Computer analysis of this amino acid sequence gave the following results:
Sequence motifs

ORF8 (SEQ ID NO: 228) is proline-rich and has a distribution of proline residues consistent with a
surface localization. Furthermore, the presence of an RGD motif may indicate a possible role in
bacterial adhesion events.
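
The two observations above (proline content and the RGD motif) can be reproduced with simple string operations. The sketch below is illustrative only: the helper names find_motif and proline_fraction are hypothetical, and the input is the first 44 residues of ORF8 as printed in the listing, so the proline fraction shown applies to that fragment rather than to the whole protein.

```python
def find_motif(protein, motif):
    """Return the 0-based start index of every occurrence of motif."""
    positions = []
    start = protein.find(motif)
    while start != -1:
        positions.append(start)
        # Step one residue forward so overlapping hits are also found.
        start = protein.find(motif, start + 1)
    return positions


def proline_fraction(protein):
    """Fraction of residues that are proline."""
    return protein.count("P") / len(protein)


# Residues 1-44 of ORF8 (SEQ ID NO: 228) as listed above.
orf8_nterm = "PRRPRHAPVSRGDLLQGGGTYARHGHRAGRGFGRFMAEPALFPR"
print(find_motif(orf8_nterm, "RGD"), round(proline_fraction(orf8_nterm), 3))
```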

Homology with a predicted ORF from N. gonorrhoeae 

ORF8 (SEQ ID NO: 228) shows 86.5% identity over a 312aa overlap with a predicted ORF
(ORF8.ng) (SEQ ID NO: 230) from N. gonorrhoeae:



orf8ng 1 MDRDDRLRRPRHAPVPRRDLLQRGGTYARYGHRAGRGFGRFMAEPALFPR 50

MINIM I 1 1 1 1 MIMMIIMIIIII MIMMMI

orf8.pep 1 PRRPRHAPVSRGDLLQGGGTYARHGHRAGRGFGRFMAEPALFPR 44

orf8ng 51 QPPLLPDHRHGKRTGRLGGGRQKRLRPYVGGADDVHAHRRQRQRMARQRP 100

1 1 1 1 1 1 M 1 1 1 I M I 1 1 1 1 1 1 M I 1 1 1 h 1 1 1 1 1 1 1 1 1 1 1 1 1

orf8.pep 45 QPPLLPHRRHGKRTGRLGGGRQKRLRPXAGRADDVYAHRRQRQRMARQRT 94

orf8ng 101 DARDERPHRRRHRHCRRQTAAAEIHTDVAFHACRQPGRLQQNDCRNQQRQ 150

II MIMI III 1 1 i 1 1 1 1 1 1 1 M I 1 1 1 1 MIMIMIM

orf8.pep 95 HARHERPHRRGHRHRRRQTAAAEIHTDVAFHACRQPGRMQQNDCRNQQRQ 144
orf8ng 151 AYDARTFGAEYGQNAPNQRTHGQKPQPPRRHIGRKPHQPLHDGSHAARPP 200

hi II M : 1 1 1 M 1 1 1 1 1 M I i I II II II III MJI.Ii 

orf8.pep 145 AHDPRTPRGEHGENAPNQRTHGQKPQPSRRHIGRKLHQPRHDGSHAARPP 194 
orf8ng 201 QNRQHHRAAPDHRRQAAISQTQRQRNPAARPPLHTAPNRPATNRRPHQRQ 250 

5 1 1 1 1 1 1 1 1 II I M II 1 1 1 1 1 1 1 1 1 1 1 II MINIM I 

orf8.pep 195 XNRQHHRAAPDHRRQAA I S QTQRQRNP AAX P PLHT APN Q 244 

orf8ng 251 TRPPHPHRHRHQPRTGSPRRTPPLPMAGFPLAQHQYASGNFRPRHPPATH 300 

M II I I I I I I I I I I I I II I II I I I I I I I I I I I - I I I I : I I I - 1 I III 
orf8.pep 245 TRPPHPHRHRHQPRTGSPRRTPPLPMAGLPLAQHRYASGNFRPRHPAATH 294 

orf8ng 301 PPQMAGCPRTPTPAPKPA* 319

I I I I I I I M II I I I II I I 
orf8.pep 295 PPQMAGCPRTPTPAPKPA* 313 
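
The identity percentages quoted with these alignments can be recomputed from the aligned rows themselves. The sketch below is a minimal illustration, assuming that identity is counted as identical, non-gap columns divided by the number of aligned columns; alignment programs differ in how they treat gaps and overlap length, so this convention and the function name percent_identity are assumptions. The example uses the 251-300 row of ORF8ng and the 245-294 row of ORF8 shown above.

```python
def percent_identity(row_a, row_b, gap="-"):
    """Percent of aligned columns holding the same residue in both rows."""
    if len(row_a) != len(row_b):
        raise ValueError("aligned rows must have the same length")
    if not row_a:
        raise ValueError("aligned rows must not be empty")
    matches = sum(
        1 for a, b in zip(row_a, row_b) if a == b and a != gap
    )
    return 100.0 * matches / len(row_a)


# One 50-column block of the ORF8ng / ORF8 alignment printed above.
orf8ng_row = ("TRPPHPHRHR" "HQPRTGSPRR" "TPPLPMAGFP"
              "LAQHQYASGN" "FRPRHPPATH")
orf8_row = ("TRPPHPHRHR" "HQPRTGSPRR" "TPPLPMAGLP"
            "LAQHRYASGN" "FRPRHPAATH")
print(percent_identity(orf8ng_row, orf8_row))
```

Applied block by block and summed over the whole overlap, the same count gives the overall figure reported for a pair of sequences.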

The complete length ORF8ng nucleotide sequence [<SEQ ID 229>] (SEQ ID NO: 229) is
predicted to encode a protein having amino acid sequence [<SEQ ID 230>] (SEQ ID NO: 230):

1 MDRDDRLRRP RHAPVPRRDL LQRGGTYARY GHRAGRGFGR FMAEPALFPR 

51 QPPLLPDHRH GKRTGRLGGG RQKRLRPYVG GADDVHAHRR QRQRMARQRP 

101 DARDERPHRR RHRHCRRQTA AAEIHTDVAF HACRQPGRLQ QNDCRNQQRQ 

151 AYDARTFGAE YGQNAPNQRT HGQKPQPPRR HIGRKPHQPL HDGSHAARPP 

201 QNRQHHRAAP DHRRQAAISQ TQRQRNPAAR PPLHTAPNRP ATNRRPHQRQ

251 TRPPHPHRHR HQPRTGSPRR TPPLPMAGFP LAQHQYASGN FRPRHPPATH

301 PPQMAGCPRT PTPAPKPA* 

Based on the sequence motifs in these proteins, it is predicted that the proteins from N. meningitidis
and N. gonorrhoeae, and their epitopes, could be useful antigens for vaccines or diagnostics, or for
raising antibodies.

Example 28 

The following partial DNA sequence was identified in N. meningitidis [<SEQ ID 231>] (SEQ ID
NO: 231):

1 . . GAAATCAGCC TGCGGTCCGA CNACAGGCCG GTTTCCGTGN CGAAGCGGCG

51 GGATTCGGAA CGTTTTCTGC TGTTGGACGG CGGCAACAGC CGGCTCAAGT 

101 GGGCGTGGGT GGAAAACGGC ACGTTCGCAA CCGTCGGTAG CGCGCCGTAC 

151 CGCGATTTGT CGCCTTTGGG CGCGGAGTGG GCGGAAAAGG CGGATGGAAA 

201 TGTCCGCATC GTCGGTTGCG CTGTGTGCGG AGAATTCAAA AAGGCACAAG 

251 TGCAGGAACA GCTCGCCCGA AAAATCGAGT GGCTGCCGTC TTCCGCACAG

301 GCTTT . GGCA TACGCAACCA CTACCGCCAC CCCGAAGAAC ACGGTTCCGA 

351 CCGCTGGTTC AACGCCTTGG GCAGCCGCCG CTTCAGCCGC AACGCCTGCG 

401 TCGTCGTCAG TTGCGGCACG GCGGTAACGG TTGACGCGCT CACCGATGAC

451 GGACATTATC TCGGAGA.GG AACCATCATG CCCGGTTTCC ACCTGATGAA

501 AGAATCGCTC GCCGTCCGAA CCGCCAACCT CAACCGGCAC GCCGGTAAGC

551 GTTATCCTTT CCCGACCGG . . 
This corresponds to the amino acid sequence [<SEQ ID 232; ORF61>] (SEQ ID NO: 232;
ORF61):



1 . .EISLRSDXRP VSVXKRRDSE RFLLLDGGNS RLKWAWVENG TFATVGSAPY 

51 RDLSPLGAEW AEKADGNVRI VGCAVCGEFK KAQVQEQLAR KIEWLPSSAQ 

101 AXGIRNHYRH PEEHGSDRWF NALGSRRFSR NACVVVSCGT AVTVDALTDD

151 GHYLGXGTIM PGFHLMKESL AVRTANLNRH AGKRYPFPT . . 

Further work revealed the complete nucleotide sequence [<SEQ ID 233>] (SEQ ID NO: 233):



1 ATGACGGTTT TGAAGCTTTC GCACTGGCGG GTGTTGGCGG AGCTTGCCGA 

51 CGGTTTGCCG CAACACGTCT CGCAACTGGC GCGTATGGCG GATATGAAGC 

101 CGCAGCAGCT CAACGGTTTT TGGCAGCAGA TGCCGGCGCA CATACGCGGG 

151 CTGTTGCGCC AACACGACGG CTATTGGCGG CTGGTGCGCC CATTGGCGGT 

201 TTTCGATGCC GAAGGTTTGC GCGAGCTGGG GGAAAGGTCG GGTTTTCAGA 

251 CGGCATTGAA GCACGAGTGC GCGTCCAGCA ACGACGAGAT ACTGGAATTG 

301 GCGCGGATTG CGCCGGACAA GGCGCACAAA ACCATATGCG TGACCCACCT 

351 GCAAAGTAAG GGCAGGGGGC GGCAGGGGCG GAAGTGGTCG CACCGTTTGG 

401 GCGAGTGTCT GATGTTCAGT TTTGGCTGGG TGTTTGACCG GCCGCAGTAT

451 GAGTTGGGTT CGCTGTCGCC TGTTGCGGCA GTGGCGTGTC GGCGCGCCTT 

501 GTCGCGTTTA GGTTTGGATG TGCAGATTAA GTGGCCCAAT GATTTGGTTG 

551 TCGGACGCGA CAAATTGGGC GGCATTCTGA TTGAAACGGT CAGGACGGGC 

601 GGCAAAACGG TTGCCGTGGT CGGTATCGGC ATCAATTTTG TCCTGCCCAA 

651 GGAAGTAGAA AATGCCGCTT CCGTGCAATC GCTGTTTCAG ACGGCATCGC 

701 GGCGGGGCAA TGCCGATGCC GCCGTGCTGC TGGAAACGCT GTTGGTGGAA 

751 CTGGACGCGG TGTTGTTGCA ATATGCGCGG GACGGATTTG CGCCTTTTGT 

801 GGCGGAATAT CAGGCTGCCA ACCGCGACCA CGGCAAGGCG GTATTGCTGT 

851 TGCGCGACGG CGAAACCGTG TTCGAAGGCA CGGTTAAAGG CGTGGACGGA 

901 CAAGGCGTTT TGCACTTGGA AACGGCAGAG GGCAAACAGA CGGTCGTCAG 

951 CGGCGAAATC AGCCTGCGGT CCGACGACAG GCCGGTTTCC GTGCCGAAGC 

1001 GGCGGGATTC GGAACGTTTT CTGCTGTTGG ACGGCGGCAA CAGCCGGCTC 

1051 AAGTGGGCGT GGGTGGAAAA CGGCACGTTC GCAACCGTCG GTAGCGCGCC 

1101 GTACCGCGAT TTGTCGCCTT TGGGCGCGGA GTGGGCGGAA AAGGCGGATG 

1151 GAAATGTCCG CATCGTCGGT TGCGCTGTGT GCGGAGAATT CAAAAAGGCA 

1201 CAAGTGCAGG AACAGCTCGC CCGAAAAATC GAGTGGCTGC CGTCTTCCGC 

1251 ACAGGCTTTG GGCATACGCA ACCACTACCG CCACCCCGAA GAACACGGTT 

1301 CCGACCGCTG GTTCAACGCC TTGGGCAGCC GCCGCTTCAG CCGCAACGCC
1351 TGCGTCGTCG TCAGTTGCGG CACGGCGGTA ACGGTTGACG CGCTCACCGA 

1401 TGACGGACAT TATCTCGGGG GAACCATCAT GCCCGGTTTC CACCTGATGA
1451 AAGAATCGCT CGCCGTCCGA ACCGCCAACC TCAACCGGCA CGCCGGTAAG 
1501 CGTTATCCTT TCCCGACCAC AACGGGCAAT GCCGTCGCCA GCGGCATGAT 
1551 GGATGCGGTT TGCGGCTCGG TTATGATGAT GCACGGGCGT TTGAAAGAAA 
1601 AAACCGGGGC GGGCAAGCCT GTCGATGTCA TCATTACCGG CGGCGGCGCG 
1651 GCAAAAGTTG CCGAAGCCCT GCCGCCTGCA TTTTTGGCGG AAAATACCGT 
1701 GCGCGTGGCG GACAACCTCG TCATTTACGG GTTGTTGAAC ATGATTGCCG 
1751 CCGAAGGCAG GGAATATGAA CATATTTAA 

This corresponds to the amino acid sequence [<SEQ ID 234; ORF61-1>] (SEQ ID NO: 234;
ORF61-1):



1 MTVLKLSHWR VLAELADGLP QHVSQLARMA DMKPQQLNGF WQQMPAHIRG 

51 LLRQHDGYWR LVRPLAVFDA EGLRELGERS GFQTALKHEC ASSNDEILEL 

101 ARIAPDKAHK TICVTHLQSK GRGRQGRKWS HRLGECLMFS FGWVFDRPQY 

151 ELGSLSPVAA VACRRALSRL GLDVQIKWPN DLVVGRDKLG GILIETVRTG

201 GKTVAVVGIG INFVLPKEVE NAASVQSLFQ TASRRGNADA AVLLETLLVE





251 LDAVLLQYAR DGFAPFVAEY QAANRDHGKA VLLLRDGETV FEGTVKGVDG 

301 QGVLHLETAE GKQTVVSGEI SLRSDDRPVS VPKRRDSERF LLLDGGNSRL

351 KWAWVENGTF ATVGSAPYRD LSPLGAEWAE KADGNVRIVG CAVCGEFKKA 

401 QVQEQLARKI EWLPSSAQAL GIRNHYRHPE EHGSDRWFNA LGSRRFSRNA 

451 CVVVSCGTAV TVDALTDDGH YLGGTIMPGF HLMKESLAVR TANLNRHAGK

501 RYPFPTTTGN AVASGMMDAV CGSVMMMHGR LKEKTGAGKP VDVIITGGGA 

551 AKVAEALPPA FLAENTVRVA DNLVIYGLLN MIAAEGREYE HI* 

Figure 9 shows plots of hydrophilicity, antigenic index, and AMPHI regions for ORF61-1 (SEQ ID
NO: 234). Further computer analysis of this amino acid sequence gave the following results:



Homology with the baf protein of B. pertussis (accession number U12020) (SEQ ID NO: 1127).
ORF61 (SEQ ID NO: 232) and baf protein show 33% aa identity in 166aa overlap:



orf61 23 LLLDGGNSRLKWAWVE-NGTFATVGSAPYR DLSPLGAEWAEKADGNVRIVGCAVCG 77

+L+D GNSRLK W + + A AP DL LG A R +G V G

baf 3 ILIDSGNSRLKVGWFDPDAPQAAREPAPVAFDNLDLDALGRWLATLPRRPQRALGVNVAG 62

orf61 78 EFKKAQVQEQLAR KIEWLPSSAQAXGIRNHYRHPEEHGSDRW FNALGSRRFSRN 131

+ + L I WL + A G+RN YR+P++ G+DRW L +

baf 63 IARGEAIAATLRAGGCDIRWLRAQPLAMGLRNGYRNPDQLGADRWACMVGVLARQPSVHP 122

orf61 132 ACVVVSCGTAVTVDALTDDGHYLGXGTIMPGFHLMKESLAVRTANL 177

+V S GTA T+D + D +GGI+PG +M+ +LA TA+L

baf 123 PLLVASFGTATTLDTIGPDNVFPG-GLILPGPAMMRGALAYGTAHL 167





Homology with a predicted ORF from N. meningitidis (strain A)

ORF61 (SEQ ID NO: 232) shows 97.4% identity over a 189aa overlap with an ORF (ORF61a)
(SEQ ID NO: 236) from strain A of N. meningitidis:



25 10 20 30 

orf61 pep E I SLRSDXRPVS VXKRRDS ERFLLLDGGNS 

MINI Mill IMIMI IMMM 

orf 61a TVFEGTVKGVDGQGVLHLETAEGKQTWSGE I SLRSDDRPVSVPKRRDS ERFLLLDGGNS 

290 300 310 320 330 340 

30 40 50 60 70 80 90 

orf 61 . pep RLKWAWVENGTFATVGSAPYRDLS PLGAEWAEKADGNVR I VGCAVCGEFKKAQVQEQLAR 

MM MINI IIIIIMIIIilllllllllMIIIIIMIIIIMMI llllll I 

orf 61a RLKWAWVENGTFATVGSAPYRDLS PLGAEWAEKVDGNVR I VGCAVCGEFKKAQVQEQLAR 

350 360 370 380 390 400 

'35 100 110 120 130 140 150 

orf 61 .pep KI EWLPS SAQAXG I RNHYRHPEEHGSDRWFNALGSRRFSRN ACWVS CGTAVTVDALT DD 

Illllllllll 1 1 1 1 1 1 1 M I M II 1 1 1 1 1 1 1 M I ! I i M 1 1 1 M I M I 1 1 1 1 1 1 1 1 

orf 61a KIEWLPSSAOALGIRNHYRHPEEHGSDRWFNALGSRRFSRN ACVWSCGTAVTVDALT DD 
410 420 430 440 450 460 

160 170 180 189

orf61.pep GHYLGXGTIMPGFHLMKESLAVRTANLNRHAGKRYPFPT
orf61a GHYLG-GTIMPGFHLMKESLAVRTANLNRHAGKRYPFPTTTGNAVASGMMDAVCGSVMMM
470 480 490 500 510 520

orf61a HGRLKEKTGAGKPVDVIITGGGAAKVAEALPPAFLAENTVRVADNLVIHGLLNLIAAEGG

530 540 550 560 570 580 

The complete length ORF61a nucleotide sequence [<SEQ ID 235>] (SEQ ID NO: 235) is:



1 ATGACGGTTT TGAAGCCTTC GCACTGGCGG GTGTTGGCGG AGCTTGCCGA 

51 CGGTTTGCCG CAACACGTCT CGCAACTGGC GCGTATGGCG GATATGAAGC 

101 CGCAGCAGCT CAACGGTTTT TGGCAGCAGA TGCCGGCGCA CATACGCGGG 

151 CTGTTGCGCC AACACGACGG CTATTGGCGG CTGGTGCGCC CATTGGCGGT 

201 TTTCGATGCC GAAGGTTTGC GCGAGCTGGG GGAAAGGTCG GGTTTTCAGA 

251 CGGCATTGAA GCACGAGTGC GCGTCCAGCA ACGACGAGAT ACTGGAATTG 

301 GCGCGGATTG CGCCGGACAA GGCGCACAAA ACCATATGTG TGACCCACCT 

351 GCAAAGTAAG GGCAGGGGGC GGCAGGGGCG GAAGTGGTCG CACCGTTTGG 

401 GCGAGTGTCT GATGTTCAGT TTTGGCTGGG TGTTTGACCG GCCGCAGTAT

451 GAGTTGGGTT CGCTGTCGCC TGTTGCGGCA GTGGCGTGCC GGCGCGCCTT

501 GTCGCGTTTG GGTTTGAAAA CGCAAATCAA GTGGCCAAAC GATTTGGTCG 

551 TCGGACGCGA CAAATTGGGC GGCATTCTGA TTGAAACGGT CAGGACGGGC 

601 GGCAAAACGG TTGCCGTGGT CGGTATCGGC ATCAATTTCG TGCTGCCCAA 

651 GGAAGTGGAA AACGCCGCTT CCGTGCAATC GCTGTTTCAG ACGGCATCGC 

701 GGCGGGGAAA TGCCGATGCC GCCGTGTTGC TGGAAACGCT GTTGGCGGAA 

751 CTTGATGCGG TGTTGTTGCA ATATGCGCGG GACGGATTTG CGCCTTTTGT 

801 GGCGGAATAT CAGGCTGCCA ACCGCGACCA CGGCAAGGCG GTATTGCTGT

851 TGCGCGACGG CGAAACCGTG TTCGAAGGCA CGGTTAAAGG CGTGGACGGA 

901 CAAGGCGTTC TGCACTTGGA AACGGCAGAG GGCAAACAGA CGGTCGTCAG 

951 CGGCGAAATC AGCCTGCGGT CCGACGACAG GCCGGTTTCC GTGCCGAAGC 

1001 GGCGGGATTC GGAACGTTTT CTGCTGTTGG ACGGCGGCAA CAGCCGGCTC 

1051 AAGTGGGCGT GGGTGGAAAA CGGCACGTTC GCAACCGTCG GTAGCGCGCC 

1101 GTACCGCGAT TTGTCGCCTT TGGGCGCGGA GTGGGCGGAA AAGGTGGATG 

1151 GAAATGTCCG CATCGTCGGT TGCGCCGTGT GCGGAGAATT CAAAAAGGCA 

1201 CAAGTGCAGG AACAGCTCGC CCGAAAAATC GAGTGGCTGC CGTCTTCCGC 

1251 ACAGGCTTTG GGCATACGCA ACCACTACCG CCACCCCGAA GAACACGGTT
1301 CCGACCGCTG GTTCAACGCC TTGGGCAGCC GCCGCTTCAG CCGCAACGCC

1351 TGCGTCGTCG TCAGTTGCGG CACGGCGGTA ACGGTTGACG CGCTCACCGA

1401 TGACGGACAT TATCTCGGGG GAACCATCAT GCCCGGTTTC CACCTGATGA
1451 AAGAATCGCT CGCCGTCCGA ACCGCCAACC TCAACCGGCA CGCCGGTAAG
1501 CGTTATCCTT TCCCGACCAC AACGGGCAAT GCCGTCGCCA GCGGCATGAT 
1551 GGATGCGGTT TGCGGCTCGG TTATGATGAT GCACGGGCGT TTGAAAGAAA 
1601 AAACCGGGGC GGGCAAGCCT GTCGATGTCA TCATTACCGG CGGCGGCGCG 
1651 GCAAAAGTTG CCGAAGCCCT GCCGCCTGCA TTTTTGGCGG AAAATACCGT 
1701 GCGCGTGGCG GACAACCTCG TCATTCACGG GCTGCTGAAC CTGATTGCCG 
1751 CCGAAGGCGG GGAATCGGAA CATACTTAA 

This encodes a protein having amino acid sequence [<SEQ ID 236>] (SEQ ID NO: 236):



1 MTVLKPSHWR VLAELADGLP QHVSQLARMA DMKPQQLNGF WQQMPAHIRG 

51 LLRQHDGYWR LVRPLAVFDA EGLRELGERS GFQTALKHEC ASSNDEILEL 

101 ARIAPDKAHK TICVTHLQSK GRGRQGRKWS HRLGECLMFS FGWVFDRPQY 

151 ELGSLSPVAA VACRRALSRL GLKTQIKWPN DLVVGRDKLG GILIETVRTG

201 GKTVAVVGIG INFVLPKEVE NAASVQSLFQ TASRRGNADA AVLLETLLAE

251 LDAVLLQYAR DGFAPFVAEY QAANRDHGKA VLLLRDGETV FEGTVKGVDG
301 QGVLHLETAE GKQTVVSGEI SLRSDDRPVS VPKRRDSERF LLLDGGNSRL

351 KWAWVENGTF ATVGSAPYRD LSPLGAEWAE KVDGNVRIVG CAVCGEFKKA

401 QVQEQLARKI EWLPSSAQAL GIRNHYRHPE EHGSDRWFNA LGSRRFSRNA
451 CVVVSCGTAV TVDALTDDGH YLGGTIMPGF HLMKESLAVR TANLNRHAGK
501 RYPFPTTTGN AVASGMMDAV CGSVMMMHGR LKEKTGAGKP VDVIITGGGA
551 AKVAEALPPA FLAENTVRVA DNLVIHGLLN LIAAEGGESE HT*

ORF61a (SEQ ID NO: 236) and ORF61-1 (SEQ ID NO: 234) show 98.5% identity in 591 aa
overlap:

10 20 30 40 50 60

orf61a.pep MTVLKPSHWRVLAELADGLPQHVSQLARMADMKPQQLNGFWQQMPAHIRGLLRQHDGYWR

Mill 1 1 1 1 1 M 1 1 1 1 II 1 1 i I 1 1 1 i 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 M I M I i 1 1 1 II I I

orf61-1 MTVLKLSHWRVLAELADGLPQHVSQLARMADMKPQQLNGFWQQMPAHIRGLLRQHDGYWR

10 20 30 40 50 60 

10 70 80 ' 90 100 110 120 

orf61a.pep LVRPLAVFDAEGLRELGERSGFQTALKHECASSNDEILELARIAPDKAHKTICVTHLQSK 

I II M II II II II II I 1 1 II I , II 1 1 1 II II II II II II II I II 1 1 1 II II II II 1 1 

orf 61-1 LVRPLAVFDAEGLRELGERSGFQTALKHECASSNDEILELARIAPDKAHKTICVTHLQSK 

70 80 90 100 110 120 

15 130 140 150 160 170 180 

orf 61a . pep GRGRQGRKWSHRLGECLMFSFGWVFDRPQYELGSLSPVAAVACRRALSRLGLKTQIKWPN 

i ii ii ii ii ii ii ii 1 1 ii 1 1 1 n ii m ii ii 1 1 ii 1 1 ii 1 1 1 1 1 1 in mim i 

orf 61-1 GRGRQGRKWSHRLGECLMFSFGWVFDRPQYELGSLSPVAAVACRRALSRLGLDVQIKWPN 

130 140 150 160 170 180 

20 190 200 210 220 230 240 

orf 61a . pep DLWGRDKLGGILIETVRTGGKTVAWGIGINFVLPKEVENAASVQSLFQTASRRGNADA 

I I I I I I I I I I I I I I I I I I II I II I . I I I I I I I I I I I Ml II I I II I I I I I I I I I I I I I 
orf 61-1 DLWGRDKLGGI L I ETVRTGGKTVAWG I G INFVLPKEVENAAS VQSLFQTASRRGNADA 

190 200 210 220 230 240 

25 250 260 270 280 290 300 

orf 61a . pep AVLLETLLAELDAVLLQYARDGFAPFVAEYQAANRDHGKAVLLLRDGETVFEGTVKGVDG 

' I I I II I h III I I I I I I I I I I I I I I I I I I I I I I I II I I I II M I II I I I I M I I I I 
or f 6 1 - 1 AVLLETLLVELDAVLLQYARDGFAPFVAEYQAANRDHGKAVLLLRDGETVFEGTVKGVDG 

250 260 270 280 290 300 

30 310 320 330 340 350 360 

orf 61a . pep QGVLHLETAEGKQTWSGEISLRSDDRPVSVPKRRDSERFLLLDGGNSRLKWAWVENGTF 

1 1 II 1 1 II I M 1 1 1 1 1 1 1 1 MM 1 1 I i 1 1 1 1 1 1 1 1 1 1 1 1 1 1 II 1 1 1 1 1 1 1 1 II 1 1 1 1 1 1 

orf 61 - 1 QGVLHLETAEGKQTWSGEISLRSDDRPVSVPKRRDSERFLLLDGGNSRLKWAWVENGTF 

310 320 330 340 350 360 

35 370 380 390 400 410 420 

orf 61a . pep ATVGSAPYRDLSPLGAEWAEKVDGNVRIVGCAVCGEFKKAQVQEQLARKIEWLPSSAQAL 

II MM I II II 1 1 III II II Ml IMIIMI II II MIMI lllllll 1 1 1 IIIMIMI 

or f 6 1 - 1 ATVGSAPYRDLSPLGAEWAEKADGNVRI VGCAVCGEFKKAQVQEQLARKIEWLPSSAQAL 

370 380 390 400 410 420 

40 430 440 ' 450 460 470 480 

orf 61a . pep GIRNHYRHPEEHGSDRWFNALGSRRFSRNACVWSCGTAVTVDALTDDGHYLGGTIMPGF 

IMIIIMIIIMIIMMIIIMM I I I IMMIIIIIIIMIIIIIIIIIIM 

or f 6 1 - 1 G I RNH YRH PEEHGS DRW FNALGS RRFS RNACVWS CGTAVT VDALTDDGHYLGGT I M PGF 

430 440 450 460 470 480 

490 500 510 520 530 540

orf61a.pep HLMKESLAVRTANLNRHAGKRYPFPTTTGNAVASGMMDAVCGSVMMMHGRLKEKTGAGKP

I II I M 1 1 M 1 1 II 1 1 1 1 M II II II II M I M M I II I M 1 1 II 1 1 M 1 1 M II MM

orf61-1 HLMKESLAVRTANLNRHAGKRYPFPTTTGNAVASGMMDAVCGSVMMMHGRLKEKTGAGKP
490 500 510 520 530 540

550 560 570 580 590 

orf 61a . pep VDVIITGGGAAKVAEALPPAFLAENTVRVADNLVIHGLLNLIAAEGGESEHTX 

1 [Mil ! I I I I M ! I I I I I : I I I I : I I I I 1 I || 

orf 61 - 1 VDVI I TGGGAAKVAEALP PAFIAENTVRVADNLVI YGLLNM I AAEGREYEH I X 

550 560 570 580 590 

Homology with a predicted ORF from N. gonorrhoeae 

ORF61 (SEQ ID NO: 232) shows 94.2% identity over a 189aa overlap with a predicted ORF
(ORF61.ng) (SEQ ID NO: 238) from N. gonorrhoeae:

orf 61 pep E I SLRSDXRPVS VXKRRDSERFLLLDGGNS 30 

Mill I I Ml II lllllllhllll 

orf 61ng TVCEGTVKGVDGRGVLHLETAEGEQTWSGEISLRPDNRSVSVPKRPDSERFLLLEGGNS 211 

orf 61 . pep RLKWAWVENGTFATVGSAPYRDLSPLGAEWAEKADGNVRIVGCAVCGEFKKAQVQEQLAR 90 

1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 ! 1 1 Mllhlllll 

orf 6 lng RLKWAWVENGTFATVGSAP YRDLS PLGAEWAEKADGNVRI VGCAVCGES KKAQVKEQLAR 271 

orf 61 .pep KIEWLPSSAQAXGIRNHYRHPEEHGSDRWFNALGSRRFSRNACVWSCGTAVTVDALTDD 150 

IMIMIMM IMMIIMMMIMMMMMMMMIIMMMMIIIMIII 

orf61ng KIEWLPSSAQALGIRNHYRHPEEHGSDRWFNALGSRRFSRNACVWSCGTAVTVDALTDD 331 

orf 61 .pep GHYLGXGTIMPGFHLMKESLAVRTANLNRHAGKRYPFPT 189 

Mill II I M I M 1 1 1 M 1 1 1 II .1 1 1 IIIIIMM 

orf61ng GHYLG-GTIMPGFHLMKESLAVRTANLNRPAGKRYPFPTTTGNAVASGMMDAVCGSIMMM 3 90 

An ORF61ng nucleotide sequence [<SEQ ID 237>] (SEQ ID NO: 237) was predicted to encode a
protein having amino acid sequence [<SEQ ID 238>] (SEQ ID NO: 238):



1 MFSFGWAFDR PQYELGSLSP VAALACRRAL GCLGLETQIK WPNDLVVGRD

51 KLGGILIETV RAGGKTVAVV GIGINFVLPK EVENAASVQS LFQTASRRGN

101 ADAAVLLETL LAELGAVLEQ YAEEGFAPFL NEYETANRDH GKAVLLLRDG

151 ETVCEGTVKG VDGRGVLHLE TAEGEQTVVS GEISLRPDNR SVSVPKRPDS

201 ERFLLLEGGN SRLKWAWVEN GTFATVGSAP YRDLSPLGAE WAEKADGNVR

251 IVGCAVCGES KKAQVKEQLA RKIEWLPSSA QALGIRNHYR HPEEHGSDRW

301 FNALGSRRFS RNACVVVSCG TAVTVDALTD DGHYLGGTIM PGFHLMKESL

351 AVRTANLNRP AGKRYPFPTT TGNAVASGMM DAVCGSIMMM HGRLKEKNGA

401 GKPVDVIITG GGAAKVAEAL PPAFLAENTV RVADNLVIHG LLNLIAAEGG

451 ESEHA*

Further analysis revealed the complete gonococcal DNA sequence [<SEQ ID 239>] (SEQ ID NO:
239) to be:



1 ATGACGGTTT TGAAGCCTTC GCATTGGCGG GTGTTGGCGG AGCTTGCCGA 

51 CGGTTTGCCG CAACACGTAT CGCAATTGGC GCGTGAGGCG GACATGAAGC 

101 CGCAGCAGCT CAACGGTTTT TGGCAGCAGA TGCCGGCGCA TATACGCGGG 

151 CTGTTGCGCC AACACGACGG CTATTGGCGG CTGGTGCGCC CCTTGGCGGT 

201 TTTCGATGCC GAAGGTTTGC GCGATCTGGG GGAAAGGTCG GGTTTTCAGA 

251 CGGCATTGAA GCACGAGTGC GCGTCCAGCA ACGACGAGAT ACTGGAATTG 
301 GCGCGGATTG CGCCGGACAA GGCGCACAAA ACCATATGCG TGACCCACCT

351 GCAAAGTAAG GGCAGGGGGC GGCAGGGGCG GAAGTGGTCG CACCGTTTGG

401 GCGAGTGCCT GATGTTCAGT TTCGGCTGGG CGTTTGACCG GCCGCAGTAT
451 GAGTTGGGTT CGCTGTCGCC TGTTGCGGCA CTTGCGTGCC GGCGCGCTTT
501 GGGGTGTTTG GGTTTGGAAA CGCAAATCAA GTGGCCAAAC GATTTGGTCG 
551 TCGGACGCGA CAAATTGGGC GGCATTCTGA TTGAAACAGT CAGGGCGGGC 
601 GGTAAAACGG TTGCCGTGGT CGGTATCGGC ATCAATTTCG TGCTGCCCAA 
651 GGAAGTGGAA AACGCCGCTT CCGTGCAGTC GCTGTTTCAG ACGGCATCGC 
701 GGCGGGGCAA TGCCGATGCC GCCGTATTGC TGGAAACATT GCTTGCGGAA 
751 CTGGGCGCGG TGTTGGAACA ATATGCGGAA GAAGGGTTCG CGCCATTTTT 
801 AAATGAGTAT GAAACGGCCA ACCGCGACCA CGGCAAGGCG GTATTGCTGT 
851 TGCGCGACGG CGAAACCGTG TGCGAAGGCA CGGTTAAAGG CGTGGACGGA 
901 CGAGGCGTTC TGCACTTGGA AACGGCAgaa ggcgaACAGa cggtcgtcag 
951 cggcgaaaTC AGcctGCggc ccgacaacaG GTCGGtttcc gtgccgaagc 

1001 ggccggatTC GgaacgtTTT tTGCtgttgg aaggcgggaa cagccgGCTC 

1051 AAGTGGGCGT GggtggAAAa cggcacgttc gcaaccgtgg gcagcgcgCc 

1101 gtaCCGCGAT TTGTCGCCTT TGGGCGCGGA GTGGGCGGAA AAGGCGGATG 

1151 GAAATGTCCG CATCGTCGGT TGCGCCGTGT GCGGAGAATC CAAAAAGGCA 

1201 CAAGTGAAGG AACAGCTCGC CCGAAAAATC GAGTGGCTGC CGTCTTCCGC 

1251 ACAGGCTTTG GGCATACGCA ACCACTACCG CCACCCCGAA GAACACGGTT

1301 CCGACCGTTG GTTCAACGCC TTGGGCAGCC GCCGCTTCAG CCGCAACGCC
1351 TGCGTCGTCG TCAGTTGCGG CACGGCGGTA ACGGTTGACG CGCTCACCGA

1401 TGACGGACAT TATCTCGGCG GAACCATCAT GCCCGGCTTC CACCTGATGA
1451 AAGAATCGCT CGCCGTCCGA ACCGCCAACC TCAACCGCCC CGCCGGCAAA
1501 CGTTACCCTT TCCCGACCAC AACGGGCAAC GCCGTCGCAA GCGGCATGAT 
1551 GGACGCGGTT TGCGGCTCGA TAATGATGAT GCACGGCCGT TTGAAAGAAA 
1601 AAAACGGCGC GGGCAAGCCT GTCGATGTCA TCATTACCGG CGGCGGCGCG 
1651 GCGAAAGTCG CCGAAGCCCT GCCGCCTGCA TTTTTGGCGG AAAATACCGT 
1701 GCGCGTGGCG GACAACCTCG TCATCCACGG GCTGCTGAAC CTGATTGCCG 
1751 CCGAAGGCGG GGAATCGGAA CACGCTTAA 

This corresponds to the amino acid sequence [<SEQ ID 240; ORF61ng-1>] (SEQ ID NO: 240;
ORF61ng-1):



1 MTVLKPSHWR VLAELADGLP QHVSQLAREA DMKPQQLNGF WQQMPAHIRG 

51 LLRQHDGYWR LVRPLAVFDA EGLRDLGERS GFQTALKHEC ASSNDEILEL 

101 ARIAPDKAHK TICVTHLQSK GRGRQGRKWS HRLGECLMFS FGWAFDRPQY 

151 ELGSLSPVAA LACRRALGCL GLETQIKWPN DLVVGRDKLG GILIETVRAG

201 GKTVAVVGIG INFVLPKEVE NAASVQSLFQ TASRRGNADA AVLLETLLAE

251 LGAVLEQYAE EGFAPFLNEY ETANRDHGKA VLLLRDGETV CEGTVKGVDG

301 RGVLHLETAE GEQTVVSGEI SLRPDNRSVS VPKRPDSERF LLLEGGNSRL

351 KWAWVENGTF ATVGSAPYRD LSPLGAEWAE KADGNVRIVG CAVCGESKKA

401 QVKEQLARKI EWLPSSAQAL GIRNHYRHPE EHGSDRWFNA LGSRRFSRNA
451 CVVVSCGTAV TVDALTDDGH YLGGTIMPGF HLMKESLAVR TANLNRPAGK
501 RYPFPTTTGN AVASGMMDAV CGSIMMMHGR LKEKNGAGKP VDVIITGGGA 
551 AKVAEALPPA FLAENTVRVA DNLVIHGLLN LIAAEGGESE HA* 

ORF61ng-1 (SEQ ID NO: 240) and ORF61-1 (SEQ ID NO: 234) show 93.9% identity in 591 aa
overlap:



orf 61ng- 1 . pep MTVLKPSHWRVLAELADGLPQHVSQLAREADMKPQQLNGFWQQMPAHIRGLLRQHDGYWR 60 

III I I I I II I I I II I I I I I I I I II I I I I I I I I I I I I II I I I I I I I I I I I I I I I I I I 
orf61-1 MTVLKLSHWRVLAELADGLPQHVSQLARMADMKPQQLNGFWQQMPAHIRGLLRQHDGYWR 60

orf 61ng-l .pep LVRPLAVFDAEGLRDLGERSGFQTALKHECASSNDEILELARIAPDKAHKTICVTHLQSK 120 

II II M II M II M II II MM II II I! II II II II M 1 1 II I M 1 1 M I M I M 1 1 
orf61-1 LVRPLAVFDAEGLRELGERSGFQTALKHECASSNDEILELARIAPDKAHKTICVTHLQSK 120

orf 61ng-l .pep GRGRQGRKWSHRLGECLMFSFGWAFDRPQYELGSLSPVAALACRRALGCLGLETQIKWPN 180 

IMMII IIIIIMMIIIIIIMIIIIIIIIIMIIIM llllh MhMMMI 

orf 61 - 1 GRGRQGRKWSHRLGECLMFSFGWVFDRPQYELGSLSPVAAVACRRALSRLGLDVQIKWPN 180 

5 orf 61ng-l .pep DLWGRDKLGGILIETVRAGGKTVAVVGIGINFVLPKEVENAASVQSLFQTASRRGNADA 240 

Ml MINIM IIMIIMIIMII MIMIIMMIIIIIIMIIIIIIIIIMMI! 

orf 61-1 DLWGRDKLGGILIETVRTGGKTVAWGIGINFVLPKEVENAASVQSLFQTASRRGNADA 24 0 

orf 61ng-l .pep AVLLETLLAELGAVLEQYAEEGFAPFLNEYETANRDHGKAVLLLRDGETVCEGTVKGVDG 300 

Illllllhll III Mi- MM: ||::|||||||||||||||||| lllllllll 
10 or f 6 1 - 1 AVLLETLLVELDAVLLQYARDGFAPFVAEYQAANRDHGKAVLLLRDGETVFEGTVKGVDG 3 00 

orf 61ng-l .pep RGVLHLETAEGEQTWSGE I SLRPDNRSVSVPKRPDSERFLLLEGGNSRLKWAWVENGTF 360 

MMMMMMMMMMMI hi MM 1 1 1 1 1 1 1 h 1 1 1 1 1 1 1 1 1 II 1 1 1 1 1 

orf 61- 1 QGVLHLETAEGKQTWSGE I SLRSDDRPVSVPKRRDSERFLLLDGGNSRLKWAWVENGTF 360 

orf 61ng-l .pep ATVGSAPYRDLSPLGAEWAEKADGNVRIVGCAVCGESKKAQVKEQLARKIEWLPSSAQAL 420 

15 1 1 1 1 1 1 1 1 II 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 II h 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 

orf 61-1 ATVGS APYRDLS PLGAEWAEKADGNVRI VGCAVCGEFKKAQVQEQLARKI EWLPSSAQAL 420 

orf 61ng-l .pep GIRNHYRHPEEHGSDRWFNALGSRRFSRNACVWSCGTAVTVDALTDDGHYLGGTIMPGF 4 80 

Illlllllllllllll I MM I II Illlllllllllllll IMMII llllll II II III 

orf 61 - 1 GIRNHYRHPEEHGSDRWFNALGSRRFSRNACVWSCGTAVTVDALTDDGHYLGGTIMPGF 4 80 

20 orf 61ng-l .pep HLMKESLAVRTANLNRPAGKRYPFPTTTGNAVASGMMDAVCGSIMMMHGRLKEKNGAGKP 540 

Illlllllllllllll M I MM M M MMMM Ml M M M 1 1 M M M M I 1 1 

orf 61-1 HLMKESLAWTANLNRHAGKRYPFPTTTGNAVASGMMDAVCGSVMMMHGRLKEKTGAGKP 54 0 

orf 61ng-l .pep VD V 1 1 TGGGAAKVAE AL P P AFLAENT VRVADNLV I HGLLNL I AAEGGE S EHAX 593 

I II I I II I II I II II II I I I II II I I I I II I I I I h II I M M I I I I II 
25 orf 61-1 VDVIITGGGAAKVAEALPPAFLAENTVRVADNLVIYGLLNMIAAEGREYEHIX 593 

Based on this analysis, including the homology with the baf protein (SEQ ID NO: 1127) of
B. pertussis and the presence of a putative prokaryotic membrane lipoprotein lipid attachment site,
it is predicted that these proteins from N. meningitidis and N. gonorrhoeae, and their epitopes, could
be useful antigens for vaccines or diagnostics, or for raising antibodies.

Example 29

The following partial DNA sequence was identified in N. meningitidis [<SEQ ID 241>] (SEQ ID
NO: 241):

1 ATGTTTTACC AAATCCTTGC CCTGATTATC TGGAGCAGCT CGTTTATTGC 

51 CGCCAAATAT GTCTATGGCG GCATCGATCC CGCATTGATG GTCGGCGTGC 

101 GCCTGCTAAT TGCCGCGCTG CCTGCACTGC CCGCCTGCCG CCGTCATGTC

151 GGCAAGATTC CGCGTGAGGA ATGGAAGCCG TTGCTGATTG TGTCGTTCGT 

201 CAACTATGTG CTGACCCTGC TGCTTCAGTT TGTCGGGTTG AAATACACTT 

251 CCGCCGCCAG CGCATCGGTC ATTGTCGGAC TCGAGCCGCT GCTGATGGTG 

301 TTTGTCGGAC ACTTTTTCTT CAACGACAAA GCGCGTGCCT ACCACTGGAT 
351 ATGCGGCGCG GCGGCATTTG CCGGTGTCGC GCTGCTGATG GCGGGCGGTG

401 CGGaAGAGGG CGGCGaAGTC GGCTGGTTCG GCTGCCTGCT GGTGTTGTTG

451 GCGGGCGCGG GCTTTTGTGC CGCTATGCGT CCGACGCAAA GGCTGATTGC

501 ACGCATCGGC GCACCGGCAT TCACATCTGT TTCCATTGCC GCCGCATCGT 

551 TGATGTGCCT GCCGTTTTCG CTTGCTTTGG CGCAAAGTTA TACCGTGGAC 

601 TGGAGCGTCG GGATGGTATT GTCGCTGCTG TATTTGGGTT TGGGGTGC . . 

This corresponds to the amino acid sequence [<SEQ ID 242; ORF62>] (SEQ ID NO: 242; 
ORF62): 



1 MFYQILALII WSSSFIAAKY VYGGIDPALM VGVRLLIAAL PALPACRRHV 

51 GKIPREEWKP LLIVSFVNYV LTLLLQFVGL KYTSAASASV IVGLEPLLMV 

101 FVGHFFFNDK ARAYHWICGA AAFAGVALLM AGGAEEGGEV GWFGCLLVLL 

151 AGAGFCAAMR PTQRLIARIG APAFTSVSIA AASLMCLPFS LALAQSYTVD 

201 WSVGMVLSLL YLGLGC . . 
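
As an illustration of how an amino acid sequence such as SEQ ID NO: 242 follows from the nucleotide sequence of SEQ ID NO: 241, the minimal Python sketch below translates the first 30 nucleotides listed above in reading frame 1. The function name `translate` and the use of the standard genetic code table are assumptions made for illustration; the source does not state which software performed the translation.

```python
from itertools import product

# Standard genetic code, with codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(dna: str) -> str:
    """Translate DNA in frame 1; an incomplete trailing codon is ignored."""
    dna = dna.upper().replace(" ", "")
    return "".join(
        CODON_TABLE.get(dna[i:i + 3], "X")  # unknown codons become X
        for i in range(0, len(dna) - 2, 3)
    )


# First three 10-nucleotide blocks of SEQ ID NO: 241 as listed above.
print(translate("ATGTTTTACC AAATCCTTGC CCTGATTATC"))  # MFYQILALII
```

The output agrees with the first ten residues of the ORF62 sequence listed above.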

Further work revealed the complete nucleotide sequence [<SEQ ID 243>] (SEQ ID NO: 243): 



1 ATGTTTTACC AAATCCTTGC CCTGATTATC TGGAGCAGCT CGTTTATTGC 

51 CGCCAAATAT GTCTATGGCG GCATCGATCC CGCATTGATG GTCGGCGTGC 

101 GCCTGCTAAT TGCCGCGCTG CCTGCACTGC CCGCCTGCCG CCGTCATGTC 

151 GGCAAGATTC CGCGTGAGGA ATGGAAGCCG TTGCTGATTG TGTCGTTCGT 

201 CAACTATGTG CTGACCCTGC TGCTTCAGTT TGTCGGGTTG AAATACACTT 

251 CCGCCGCCAG CGCATCGGTC ATTGTCGGAC TCGAGCCGCT GCTGATGGTG 
301 TTTGTCGGAC ACTTTTTCTT CAACGACAAA GCGCGTGCCT ACCACTGGAT 

351 ATGCGGCGCG GCGGCATTTG CCGGTGTCGC GCTGCTGATG GCGGGCGGTG 

401 CGGAAGAGGG CGGCGAAGTC GGCTGGTTCG GCTGCCTGCT GGTGTTGTTG 
451 GCGGGCGCGG GCTTTTGTGC CGCTATGCGT CCGACGCAAA GGCTGATTGC 
501 ACGCATCGGC GCACCGGCAT TCACATCTGT TTCCATTGCC GCCGCATCGT 
551 TGATGTGCCT GCCGTTTTCG CTTGCTTTGG CGCAAAGTTA TACCGTGGAC 
601 TGGAGCGTCG GGATGGTATT GTCGCTGCTG TATTTGGGTT TGGGGTGCGG 
651 CTGGTACGCC TATTGGCTGT GGAACAAGGG GATGAGCCGT GTTCCTGCCA 
701 ATGTTTCGGG ACTGTTGATT TCGCTCGAAC CCGTCGTCGG CGTGCTGCTG 
751 GCGGTTTTGA TTTTGGGCGA ACACCTGTCG CCCGTGTCCG CCTTGGGCGT 
801 GTTTGTCGTC ATCGCCGCCA CCTTGGTTGC CGGCCGGCTG TCGCATCAAA 
851 AATAA 

This corresponds to the amino acid sequence [<SEQ ID 244; ORF62-1>] (SEQ ID NO: 244; 
ORF62-1): 



1 MFYQILALII WSSSFIAAKY VYGGIDPALM VGVRLLIAAL PALPACRRHV 

51 GKIPREEWKP LLIVSFVNYV LTLLLQFVGL KYTSAASASV IVGLEPLLMV 

101 FVGHFFFNDK ARAYHWICGA AAFAGVALLM AGGAEEGGEV GWFGCLLVLL 

151 AGAGFCAAMR PTQRLIARIG APAFTSVSIA AASLMCLPFS LALAQSYTVD 

201 WSVGMVLSLL YLGLGCGWYA YWLWNKGMSR VPANVSGLLI SLEPVVGVLL 

251 AVLILGEHLS PVSALGVFVV IAATLVAGRL SHQK* 



Computer analysis of this amino acid sequence gave the following results: 



Homology with hypothetical transmembrane protein HI0976 of H. influenzae (accession number 
Q57147) (SEQ ID NO: 1128) 

ORF62 (SEQ ID NO: 242) and HI0976 (SEQ ID NO: 1128) show 50% aa identity in 114aa 
overlap: 

Orf62 1 MFYQILALIIWSSSFIAAKYVYGGIDPALMVGVRXXXXXXXXXXXCRRHVGKIPREEWKP 60 
M YQILAL+IWSSS IKY +DP L+V VR R KI + K 

HI0976 1 MLYQILALLIWSSSLIVGKiTYSMMDPVLVVQVRLIIAMIIVMPLFLRRWKKIDKPMRKQ 60 

Orf62 61 LLIVSFVNYVLTLLLQFVGLKYTSAASASVIVGLEPLLMVFVGHFFFNDKARAY 114 

L ++F NY LLQF+GLKYTSA+SA + +GLEPLL+ VFVGHFFF K . + 
HI0976 61 LVJWLAFFNYTAVFLLQFIGLKYTSASSAVTMIGLEPLLWFVGHFFFKTKQNGF 114 

Homology with a predicted ORF from N. meningitidis (strain A) 

ORF62 (SEQ ID NO: 242) shows 99.5% identity over a 216aa overlap with an ORF (ORF62a) 
(SEQ ID NO: 246) from strain A of N. meningitidis: 



10 20 30 40 50 60 

orf 62 . pep MFYQI LALI IWSSSFIA AKYVYGGID PALMVGVRLLIAALPAL PACRRHVGKI PREEWKP 

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I li I I I I I I I I I M I I I I I I I I I I I I I I 

orf 62a MFYQI LALI IWSSSFIA AKYVYGGID PALMVGVRLLIAALPAL PACRRHVGKI PREEWKP 

10 20 30 40 50 60 






70 80 90 100 110 120 

orf 62 . pep LLIVS FVNYVLTLLLQFVGLKYTSAASASVIVGLEPLLMVFVGHFFFNDKARAYHWICGA 

MMIIIIIMMIII lllllill llllllll IIMIIIMMIIIIIIJIIIIII 

orf 62a LLIVSFVNYVLTLLLQFVGLKYTSAASASVIVGLEPLLMVFVGHFFFNDKARAYHWICGA 

70 80 90 100 110 120 






130 140 150 160 170 180 

orf 62 . pep AAFAGVALLMAGGA EEGGEVGW FGCLLVLLAGAGFCAAM RPTQRLIARIGAPAFTS VSIA 

1 1 i I II I II 1 1 1 1 1 1 1 1 1 1 1 1 II , 1 1 1 II 1 1 hi 1 1 1 1 1 1 M 1 1 1 1 1 1 1 1 1 1 1 1 1 

orf 62a AAFAGVALLMAGGA EEGGEVGW FGCLLVLLAGAGFCAAM RPTQRLIARIGAPAFTS VSIA 

130 140 150 160 170 180 






190 200 210 

orf 62 . pep AASLMCLPFSLALA QSYTVDWSVGMVLSLLYLGLGC 

IIIMII IIIIIMIUIMIIIIllMlllh I 

orf 62a AASLMCLPFSIALA QSYTVDWSVGMVLSLLYLGVGCSWYAYWLWNKGMSRVPANVSGLLI 

190 200 210 220 230 240 






orf 62a SLE P WGVLLAVL I LGEHLS P VS VLGVFWI AATLVAGRLSHQKX 

250 260 270 280 

The complete length ORF62a nucleotide sequence [<SEQ ID 245>] (SEQ ID NO: 245) is: 

1 ATGTTTTACC AAATCCTTGC CCTGATTATC TGGAGCAGCT CGTTTATTGC 

51 CGCCAAATAT GTCTATGGCG GCATCGATCC CGCATTGATG GTCGGCGTGC 

101 GCCTGCTGAT TGCTGCGCTG CCTGCACTGC CCGCCTGCCG CCGTCATGTC 

151 GGCAAGATTC CGCGTGAGGA ATGGAAGCCG TTGCTGATTG TGTCGTTCGT 

201 CAACTATGTG CTGACCCTGC TACTTCAGTT TGTCGGGTTG AAATACACTT 

251 CCGCCGCCAG CGCATCGGTC ATTGTCGGAC TCGAGCCACT GCTGATGGTG 

301 TTTGTCGGAC ACTTTTTCTT CAACGACAAA GCGCGTGCCT ACCACTGGAT 

351 ATGCGGCGCG GCGGCATTTG CCGGTGTCGC GCTGCTGATG GCGGGCGGTG 

401 CGGAAGAGGG CGGCGAAGTC GGCTGGTTCG GCTGCCTGCT GGTGTTGTTG 

451 GCGGGCGCGG GCTTTTGTGC CGCTATGCGT CCGACGCAAA GGCTGATTGC 

501 ACGCATCGGC GCACCGGCAT TCACATCTGT TTCCATTGCC GCCGCATCGT 

551 TGATGTGCCT GCCGTTTTCG CTTGCTTTGG CGCAAAGTTA TACCGTGGAC 

601 TGGAGCGTCG GAATGGTATT GTCGCTGCTG TATTTGGGCG TGGGGTGCAG 

651 CTGGTACGCC TATTGGCTGT GGAACAAGGG GATGAGCCGT GTTCCTGCCA 

701 ACGTTTCGGG ACTGTTGATT TCGCTCGAAC CCGTCGTCGG CGTGCTGCTG 

751 GCGGTTTTGA TTTTGGGCGA ACACCTGTCG CCCGTGTCCG TCTTGGGCGT 

801 GTTTGTCGTC ATCGCCGCCA CCTTGGTTGC CGGCCGGCTG TCGCATCAAA 

851 AATAA 

This encodes a protein having amino acid sequence [<SEQ ID 246>] (SEQ ID NO: 246): 

1 MFYQILALII WSSSFIAAKY VYGGIDPALM VGVRLLIAAL PALPACRRHV 

51 GKIPREEWKP LLIVSFVNYV LTLLLQFVGL KYTSAASASV IVGLEPLLMV 

101 FVGHFFFNDK ARAYHWICGA AAFAGVALLM AGGAEEGGEV GWFGCLLVLL 

151 AGAGFCAAMR PTQRLIARIG APAFTSVSIA AASLMCLPFS LALAQSYTVD 

201 WSVGMVLSLL YLGVGCSWYA YWLWNKGMSR VPANVSGLLI SLEPVVGVLL 

251 AVLILGEHLS PVSVLGVFVV IAATLVAGRL SHQK* 

ORF62a (SEQ ID NO: 246) and ORF62-1 (SEQ ID NO: 244) show 98.9% identity in 284 aa 
overlap: 

orf 62a . pep MFYQILALIIWSSSFIAAKYVYGGIDPALMVGVRLLIAALPALPACRRHVGKIPREEWKP 60 

IIIIIMII Mil IIIIIIMIIMIMI IMMIII IIMIIIIMIIIIII 

orf 62-1 MFYQILALI IWSSSFI AAKYVYGGIDPALMVGVRLLI AALPALPACRRHVGKI PREEWKP 60 

orf 62a .pep LLIVSFVNYVLTLLLQFVGLKYTSAASASVIVGLEPLLMVFVGHFFFNDKARAYHWICGA 120 

1 1 1 1 1 1 M M I II I I M 1 1 1 1 1 1 1 M 1 1 1 1 1 1 1 1 i 1 1 1 1 1 1 1 1 1 1 1 1 1 1 i I II M I 

orf 62 - 1 LLIVSFVNYVLTLLLQFVGLKYTSAASASVIVGLEPLLMVFVGHFFFNDKARAYHWICGA 120 

orf 62a . pep AAFAGVALLMAGGAEEGGEVGWFGCLLVLLAGAGFCAAMRPTQRLIARIGAPAFTSVSIA 180 

1 1 II IIIIIMIIIIIIMI Mill I lllllllll I II II I INI III III lllllllll 

orf 62 - 1 AAFAGVALLMAGGAEEGGEVGWFGCLLVLLAGAGFCAAMRPTQRL IARIGAPAFTS VS I A 180 

orf 62a . pep AASLMCLPFS LALAQS YTVDWSVGMVLSLL YLGVGCSWYA YWLWNKGMSRVPANVSGLL I 240 

II II II II III II III 11.11 1 II III I Ml II Ml Ml III 1 1 II 1 1 Mill Mill III 

orf 62-1 AASLMCLPFS LALAQS YTVDWSVGMVLSLLYLGLGCGWYA YWLWNKGMSRVPANVSGLL I 240 

orf 62a .pep S LEP WGVLLAVL I LGEHLS PVSVLGVFW I AATLVAGRLSHQKX 285 

IIIIMIM MM IIMIIMIIIMIIMIIIIIIIII II 

orf 62 - 1 SLEPWGVLLAVL I LGEHLS PVSALGVFW I AATLVAGRLSHQKX 285 

Homology with a predicted ORF from N. gonorrhoeae 

ORF62 (SEQ ID NO: 242) shows 99.5% identity over a 216aa overlap with a predicted ORF 
(ORF62.ng) (SEQ ID NO: 248) from N. gonorrhoeae: 

orf 62 .pep MFYQILALI IWSSSFI AAKYVYGGIDPALMVGVRLLI AALPALPACRRHVGKI PREEWKP 60 

1 1 1 1 1 1 II MM II 1 1 1 1 1 1 1 1 1 1 1 1 M I M 1 1 1 M 1 1 1 1 II I II II M 1 1 II 1 1 M 

orf 62ng MFYQ I LAL 1 1 WGSSF I AAKYVYGGIDPALMVGVRLLI AALPALPACRRHVGKI PREEWKP 60 
orf62.pep LLIVSFVNYVLTLLLQFVGLKYTSAASASVIVGLEPLLMVFVGHFFFNDKARAYHWICGA 120 

Illlll IIIIIIIIIIIIIIIMII MIIIIMIIIIIIIMIIIIMIMMMIIIM 

orf 62ng LLIVSFVNYVLTLLLQFVGLKYTSAASASVIVGLEPLLMVFVGHFFFNDKARAYHWICGA 120 

orf 62 .pep AAFAGVALLMAGGAEEGGEVGWFGCLLVLLAGAGFCAAMRPTQRLIARIGAPAFTSVSIA 180 

Ml IMIIIMIMIIIIIMMII MMMMMMMIMMIMM IMIMI 

or f 6 2 ng AAFAGVALLMAGGAEEGGEVGWFGCLLVLLAGAGFCAAMRPTQRL I AR I GAPAFTS VS I A 180 

orf 62 .pep AASLMCLPFSLALAQSYTVDWSVGMVLSLLYLGLGC 216 

IIIIMIIIIIIIIIIMIIIIIIIMIIIIIIIII 

orf 62ng AASLMCLPFSLALAQSYTVDWSVGMVLSLLYLGLGCGWYAYWLWNKGMSRVPANASGLLI 240 
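
The identity figures quoted for these alignments can be checked by counting matching positions. The sketch below is a minimal illustration, assuming the two sequences are already aligned to equal length; the function name `percent_identity` is hypothetical, and the source does not say how its alignment program counted gaps or overlap length.

```python
def percent_identity(seq_a: str, seq_b: str) -> float:
    """Percentage of identical positions in two equal-length, pre-aligned sequences."""
    if len(seq_a) != len(seq_b):
        raise ValueError("aligned sequences must have equal length")
    # A gap character never counts as a match.
    matches = sum(1 for a, b in zip(seq_a, seq_b) if a == b and a != "-")
    return 100.0 * matches / len(seq_a)


# Residues 1-20 of ORF62 and ORF62ng as listed in this example (one S/G difference).
orf62 = "MFYQILALIIWSSSFIAAKY"
orf62ng = "MFYQILALIIWGSSFIAAKY"
print(round(percent_identity(orf62, orf62ng), 1))  # 95.0

# One mismatch in a 216 aa overlap gives the 99.5% figure quoted above.
print(round(100.0 * 215 / 216, 1))  # 99.5
```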

The complete length ORF62ng nucleotide sequence [<SEQ ID 247>] (SEQ ID NO: 247) is: 



1 ATGTTTTACC AAATCCTTGC CCTGATTATC TGGGGCAGCT CGTTTATTGC 

51 CGCCAAATAT GTCTATGGCG GCATCGATCC CGCATTGATG GTCGGCGTGC 

101 GCCTGCTGAT TGCCGCGCTG CCTGCACTGC CCGCCTGCCG CCGTCATGTC 

151 GGCAAGATTC CGCGTGAGGA ATGGAAGCCG TTGCTGATTG TGTCGTTCGT 

201 CAACTATGTG CTGACCCTGC TGCTTCAGTT TGTCGGGTTG AAATACACTT 

251 CCGCCGCCAG CGCATCGGTC ATTGTCGGAC TCGAGCCGCT GCTGATGGTG 

301 TTTGTCGGAC ACTTTTTCTT CAACGACAAA GCGCGTGCCT ACCACTGGAT 

351 ATGCGGCGCG GCGGCATTTG CCGGTGTCGC GCTGCTGATG GCGGGCGGTG 

401 CGGAAGAGGG CGGCGAAGTC GGCTGGTTCG GCTGCCTGCT GGTGTTGTTG 

451 GCGGGCGCGG GCTTTTGTGC CGCTATGCGT CCGACGCAAA GGCTGATTGC 

501 CCGCATCGGC GCACCGGCAT TCACATCTGT TTCCATTGCC GCCGCATCGT 

551 TGATGTGCCT GCCGTTTTCG CTTGCTTTGG CGCAAAGTTA TACCGTGGAC 

601 TGGAGCGTCG GGATGGTATT GTCGCTGTTG TATTTGGGTT TGGGGTGCGG 

651 CTGGTACGCC TATTGGCTGT GGAACAAGGG GATGAGCCGT GTTCCTGCCA 

701 ACGCGTCGGG ACTGTTGATT TCGCTCGAAC CCGTCGTCGG CGTGCTGTTG 

751 GCGGTTTTGA TTTTGGGCGA ACATTTATCG CCCGTGTCCG CCTTGGGCGT 

801 GTTTGTCGTC ATCGCCGCCA CTTTCGCCGC CGGCCGGCTG TCGCGCAGGG 

851 ACGCGCAAAA CGGCAATGCC GTCTGA 

This encodes a protein having amino acid sequence [<SEQ ID 248>] (SEQ ID NO: 248): 



1 MFYQILALII WGSSFIAAKY VYGGIDPALM VGVRLLIAAL PALPACRRHV 

51 GKIPREEWKP LLIVSFVNYV LTLLLQFVGL KYTSAASASV IVGLEPLLMV 

101 FVGHFFFNDK ARAYHWICGA AAFAGVALLM AGGAEEGGEV GWFGCLLVLL 

151 AGAGFCAAMR PTQRLIARIG APAFTSVSIA AASLMCLPFS LALAQSYTVD 

201 WSVGMVLSLL YLGLGCGWYA YWLWNKGMSR VPANASGLLI SLEPVVGVLL 

251 AVLILGEHLS PVSALGVFVV IAATFAAGRL SRRDAQNGNA V* 

ORF62ng (SEQ ID NO: 248) and ORF62-1 (SEQ ID NO: 244) show 97.9% identity in 283 aa 
overlap: 



10 20 30 40 50 60 

orf 62ng.pep MFYQILALIIWGSSFIAAKYVYGGIDPALMVGVRLLIAALPALPACRRHVGKIPREEWKP 

M 1 1 1 M M M M 1 1 1 1 1 1 M II i 1 1 M 1 1 M II M I II 1 1 II M M 1 1 1 1 M M 1 1 

orf 62 -1 MFYQILALI IWSSSFIAAKYVYGGIDPALMVGVRLLIAALPALPACRRHVGKIPREEWKP 

10 20 30 40 50 60 



70 80 90 100 110 120 

orf 62ng . pep LLIVS FVNYVLTLLLQFVGLKYTSAAS ASVIVGLEPLLMVFVGHFFFND KARA YHWI CGA 

.lllllll II I IIIIIIIIIIIIIIIIIIIIN hllllllllllNI 

orf 62 - 1 LLIVS FVNYVLTLLLQFVGLKYTSAASASVIVGLEPLLMVFVGHFFFNDKARAYHWICGA 

70 80 90 100 110 120 

130 140 150 160 170 180 

orf 62ng.pep AAFAGVALLMAGGAEEGGEVGWFGCLLVLLAGAGFCAAMRPTQRLIARIGAPAFTSVSIA 

1 1 1 1 1 1 1 1 1 1 I i I i M I II 1 1 1 1 II 1 1 1 1 1 1 1 1 II I 1 1 1 1 I M I ! I 1 1 I : I ! 1 1 1 1 1 I 

orf 62-1 AAFAGVALLMAGGAEEGGEVGWFGCLLVLLAGAGFCAAMRPTQRLIARIGAPAFTSVSIA 
130 140 150 160 170 180 

190 200 210 220 230 240 

orf 62ng.pep AASLMCLPFSLALAQSYTVDWSVGMVLSLLYLGLGCGWYAYWLWNKGMSRVPANASGLLI 

IIIIIMIII I IMIIIIIIIIIIIMIIIIIIIIIIIIIIIIMIM hllll 

orf 62-1 AASLMCLPFSLALAQSYTVDWSVGMVLSLLYLGLGCGWYAYWLWNKGMSRVPANVSGLLI 
190 200 210 220 230 240 

250 260 270 280 290 

orf 62ng.pep SLEPVVGVLLAVLILGEHLSPVSALGVFVVIAATFAAGRLSRRDAQNGNAVX 

|llllllllll!lllllllllllllllllllll|::Mllh: 
orf 62-1 SLEPVVGVLLAVLILGEHLSPVSALGVFVVIAATLVAGRLSHQKX 

250 260 270 280 



Furthermore, ORF62ng (SEQ ID NO: 248) shows significant homology to a hypothetical 
H. influenzae protein (SEQ ID NO: 1128): 

sp|Q57147|Y976_HAEIN HYPOTHETICAL PROTEIN HI0976 >gi|1074589|pir||B64163 
hypothetical protein HI0976 - Haemophilus influenzae (strain Rd KW20) 
>gi|1574004 (U32778) hypothetical [Haemophilus influenzae] Length = 128 

Score = 106 bits (262), Expect = 2e-22 

Identities = 56/114 (49%), Positives = 68/114 (59%) 

Query: 1 MFYQILALIIWGSSFIAAKYVYGGIDPALMVGVRXXXXXXXXXXXCRRHVGKIPREEWKP 60 

M YQILAL+IW SS I K Y +DP L+V VR R KI + K 

Sbjct: 1 MLYQILALLIWSSSLIVGKiTYSMMDPVLVVQVRLIIAMIIVMPLFLRRWKKIDKPMRKQ 60 

Query: 61 LLIVSFVNYVLTLLLQFVGLKYTSAASASVIVGLEPLLMVFVGHFFFNDKARAY 114 

L ++F NY LLQF+GLKYTSA+SA ++GLEPLL+VFVGHFFF K + 
Sbjct: 61 LWWLAFFNYTAVFLLQFIGLKYTSASSAVTMIGLEPLLWFVGHFFFKTKQNGF 114 



Based on this analysis, including the homology with the transmembrane protein (SEQ ID NO: 
1128) of H. influenzae and the putative leader sequence and several transmembrane domains in the 
gonococcal protein, it is predicted that these proteins from N. meningitidis and N. gonorrhoeae, and 
their epitopes, could be useful antigens for vaccines or diagnostics, or for raising antibodies. 

Example 30 

The following partial DNA sequence was identified in N. meningitidis [<SEQ ID 249>] (SEQ ID 
NO: 249): 



1 ATGCGCCGTT TTCTACCGAT CGCAGCCATA TGCGCmGwms TCCTGkkGTA 

51 sGGACTGACG GCGGCAACCG GCAGCACCAG TTCGCTGGCG GATTATTTCT 

101 GGTGGATTGT TGCGTTCAGC GCAATGCTGC TGCTGGTGTT GTCCGCCGTT 

151 TTGGCACGTT ATGTCATATT GCTGTTGAAA GACAGGCGCG ACGGCGTATT 

201 CGGTTCGCtA srTyGCCAAA gsGCCTgkks TGGG . ATGTT TACGCTGGTT 
251 GCCGkACTGC CCGGCGTGTT TCTGTTCGGC TTTCCCGCAC AGTTCATCAA 

301 CGGCACGATT AATTCGTGGT TCGGCAACGA TACCCACGAG GCGCTTGAAC 

351 GCAGCCTCAA TTTGAGCAAG TCCGCATTGA ATTTGGCGGC AGACAACGCC 

401 CTCGGCAACG CCGTCCCCGT GCAGATAGAC CTCATCGGCG CGGCTTCCCT 

451 GCCCGGGGAT ATGGGCAGGG TGCTGGAACA TTACGCCGGC AGCGGTTTTG 

501 CCCAGCTTGC CCTGTACAAy ksCGCAAGCG GCAAAATCGA AAAAAGCATC 

551 AACCCGCACA AGCTCGATCA GCCGTTTCCA GGTAAGGCGC GTTGGGAaAa 

601 AATCCaACGG GCGGGTTCGG TCAGGGATTT GGAAAGCATA GGCGGCGTAT 

651 TGTaCGCGCA GGGCTGGCTG TCGGCGGGTA CGCACwACGG GCGCGATTAC 

701 GCCTTGTTTT TCCGTCAGCC GGTTCCCAAA GGCGTGGCAG AGGATGCCGT 

751 yTTAATCGAA AAGGCAAGGG CGAAATATGC TGAGTTGAGT TACAGCAAAA 

801 AAGGTTTGCA GACCTTTTTC CTGGCAACCC TGCTGATTGC CTCGCTGCTG 

851 TCGATTTTTC TTGCACTGGT CATGGCACTG TATTTCGCCC GCCGTTTCGT 

901 CGAACCCGTC CTATCGCTTG CCGAGGGGGC GAAGGCGGTG GCGCAAGGCG 

951 ATTTCAGCCA GACGCGCCCC GTGTTGCGCA ACGACGAGTT CGGACGCTTG 

1001 ACCArGTTGT TCAACCACAT GACCGAGCAG CTTTCCATCG CCAAAGATGC 

1051 AGACGAGCGC AACCGCCGGC GCGAGGAAGC CGCCAGGCAT TATCTTGAAT 

1101 GCGTGTTGGA GGGGCTGACC ACGGGCGTGG TGGTGTTTGA CGAACAAGGC 

1151 TGTCTGAAAA CCTTCAACAA AGCGGCGGGT ACC. . 

This corresponds to the amino acid sequence [<SEQ ID 250; ORF64>] (SEQ ID NO: 250; 
ORF64): 



1 MRRFLPIAAI CAXXLXXGLT AATGSTSSLA DYFWWIVAFS AMLLLVLSAV 

51 LARYVILLLK DRRDGVFGSX XAKXPXXXMF TLVAXLPGVF LFGFPAQFIN 

101 GTINSWFGND THEALERSLN LSKSALNLAA DNALGNAVPV QIDLIGAASL 

151 PGDMGRVLEH YAGSGFAQLA LYNXASGKIE KSINPHKLDQ PFPGKARWEK 

201 IQRAGSVRDL ESIGGVLYAQ GWLSAGTHXG RDYALFFRQP VPKGVAEDAV 

251 LIEKARAKYA ELSYSKKGLQ TFFLATLLIA SLLSIFLALV MALYFARRFV 

301 EPVLSLAEGA KAVAQGDFSQ TRPVLRNDEF GRLTXLFNHM TEQLSIAKDA 

351 DERNRRREEA ARHYLECVLE GLTTGVVVFD EQGCLKTFNK AAGT.. 
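
The partial nucleotide sequence above contains ambiguous positions written with letters such as m, w, s, k, r and y, and the corresponding residues in the amino acid sequence appear as X. The sketch below shows one way such positions could be translated, assuming the letters are standard IUPAC ambiguity codes: a codon is assigned a residue only if every possible expansion encodes the same amino acid, and X otherwise. The source does not describe how these positions were actually handled, so the function `translate_ambiguous` is a hypothetical illustration.

```python
from itertools import product

# Standard genetic code, with codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}

# IUPAC nucleotide ambiguity codes and the bases each one stands for.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT", "K": "GT", "M": "AC",
    "B": "CGT", "D": "AGT", "H": "ACT", "V": "ACG", "N": "ACGT",
}


def translate_ambiguous(dna: str) -> str:
    """Translate frame 1, writing X where the ambiguity changes the residue."""
    dna = dna.upper().replace(" ", "")
    protein = []
    for i in range(0, len(dna) - 2, 3):
        # Unrecognised symbols expand to nothing, which also yields X.
        options = [IUPAC.get(base, "") for base in dna[i:i + 3]]
        residues = {CODON_TABLE["".join(c)] for c in product(*options)}
        protein.append(residues.pop() if len(residues) == 1 else "X")
    return "".join(protein)


# First 60 nucleotides of SEQ ID NO: 249 as listed above.
partial = "ATGCGCCGTT TTCTACCGAT CGCAGCCATA TGCGCmGwms TCCTGkkGTA sGGACTGACG"
print(translate_ambiguous(partial))  # MRRFLPIAAICAXXLXXGLT
```

Under these assumptions the codon GCm still translates to alanine, because both GCA and GCC encode it, which matches residue 12 of the ORF64 sequence above.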

Further work revealed the complete nucleotide sequence [<SEQ ID 251>] (SEQ ID NO: 251): 



1 ATGCGCCGTT TTCTACCGAT CGCAGCCATA TGCGCCGTCG TCCTGTTGTA 

51 CGGACTGACG GCGGCAACCG GCAGCACCAG TTCGCTGGCG GATTATTTCT 

101 GGTGGATTGT TGCGTTCAGC GCAATGCTGC TGCTGGTGTT GTCCGCCGTT 

151 TTGGCACGTT ATGTCATATT GCTGTTGAAA GACAGGCGCG ACGGCGTATT 

201 CGGTTCGCAG ATTGCCAAAC GCCTTTCTGG GATGTTTACG CTGGTTGCCG 

251 TACTGCCCGG CGTGTTTCTG TTCGGCGTTT CCGCACAGTT CATCAACGGC 

301 ACGATTAATT CGTGGTTCGG CAACGATACC CACGAGGCGC TTGAACGCAG 

351 CCTCAATTTG AGCAAGTCCG CATTGAATTT GGCGGCAGAC AACGCCCTCG 

401 GCAACGCCGT CCCCGTGCAG ATAGACCTCA TCGGCGCGGC TTCCCTGCCC 

451 GGGGATATGG GCAGGGTGCT GGAACATTAC GCCGGCAGCG GTTTTGCCCA 

501 GCTTGCCCTG TACAATGCCG CAAGCGGCAA AATCGAAAAA AGCATCAACC 

551 CGCACAAGCT CGATCAGCCG TTTCCAGGTA AGGCGCGTTG GGAAAAAATC 

601 CAACGGGCGG GTTCGGTCAG GGATTTGGAA AGCATAGGCG GCGTATTGTA 

651 CGCGCAGGGC TGGCTGTCGG CGGGTACGCA CAACGGGCGC GATTACGCCT 

701 TGTTTTTCCG TCAGCCGGTT CCCAAAGGCG TGGCAGAGGA TGCCGTCTTA 

751 ATCGAAAAGG CAAGGGCGAA ATATGCTGAG TTGAGTTACA GCAAAAAAGG 

801 TTTGCAGACC TTTTTCCTGG CAACCCTGCT GATTGCCTCG CTGCTGTCGA 

851 TTTTTCTTGC ACTGGTCATG GCACTGTATT TCGCCCGCCG TTTCGTCGAA 

901 CCCGTCCTAT CGCTTGCCGA GGGGGCGAAG GCGGTGGCGC AAGGCGATTT 

951 CAGCCAGACG CGCCCCGTGT TGCGCAACGA CGAGTTCGGA CGCTTGACCA 

1001 AGTTGTTCAA CCACATGACC GAGCAGCTTT CCATCGCCAA AGAAGCAGAC 

1051 GAGCGCAACC GCCGGCGCGA GGAAGCCGCC AGGCATTATC TTGAATGCGT 

1101 GTTGGAGGGG CTGACCACGG GCGTGGTGGT GTTTGACGAA CAAGGCTGTC 

1151 TGAAAACCTT CAACAAAGCG GCGGAACAGA TTTTGGGGAT GCCGCTTACC 
1201 CCCCTGTGGG GCAGCAGCCG GCACGGTTGG CACGGCGTTT CGGCGCAGCA 

1251 GTCCCTGCTT GCCGAAGTGT TTGCCGCCAT CGGCGCGGCG GCAGGTACGG 

1301 ACAAACCGGT CCATGTGAAA TATGCCGCGC CGGACGATGC CAAAATCCTG 

1351 CTGGGCAAGG CAACCGTCCT GCCCGAAGAC AACGGCAACG GCGTGGTAAT 

1401 GGTGATTGAC GACATCACCG TTTTGATACA CGCGCAAAAA GAAGCCGCGT 

1451 GGGGCGAAGT GGCGAAGCGG CTGGCACACG AAATCCGCAA TCCGCTCACG 

1501 CCGATCCAGC TTTCCGCCGA ACGGCTGGCG TGGAAATTGG GCGGGAAGCT 

1551 GGATGAGCAG GATGCGCAAA TCCTGACGCG TTCGACCGAC ACCATCGTCA 

1601 AACAGGTGGC GGCATTGAAG GAAATGGTCG AAGCATTCCG CAATTATGCG 

1651 CGTTCCCCTT CGCTCAAATT GGAAAATCAG GATTTGAACG CCTTAATCGG 

1701 CGATGTGTTG GCATTGTATG AAGCCGGTCC GTGCCGGTTT GCGGCGGAGC 

1751 TTGCCGGCGA ACCGCTGACG GTGGCGGCGG ATACGACCGC CATGCGGCAG 

1801 GTGCTGCACA ATATTTTCAA AAATGCCGCC GAAGCGGCGG AAGAAGCCGA 

1851 TGTGCCCGAA GTCAGGGTAA AATCGGAAAC AGGGCAGGAC GGTCGGATTG 

1901 TCCTGACGGT TTGCGACAAC GGCAAAGGGT TCGGCAGGGA AATGCTGCAC 

1951 AACGCCTTCG AGCCGTATGT AACGGACAAA CCGGCGGGAA CGGGATTGGG 

2001 TCTGCCTGTG GTGAAAAAAA TCATTGAAGA ACACGGCGGC CGCATCAGCC 

2051 TGAGCAATCA GGATGCGGGT GGCGCGTGTG TCAGAATCAT CTTGCCAAAA 

2101 ACGGTAAAAA CTTATGCGTA G 

This corresponds to the amino acid sequence [<SEQ ID 252; ORF64-1>] (SEQ ID NO: 252; 
ORF64-1): 



1 MRRFLPIAAI CAVVLLYGLT AATGSTSSLA DYFWWIVAFS AMLLLVLSAV 
51 LARYVILLLK DRRDGVFGSQ IAKRLSGMFT LVAVLPGVFL FGVSAQFING 
101 TINSWFGNDT HEALERSLNL SKSALNLAAD NALGNAVPVQ IDLIGAASLP 
151 GDMGRVLEHY AGSGFAQLAL YNAASGKIEK SINPHKLDQP FPGKARWEKI 
201 QRAGSVRDLE SIGGVLYAQG WLSAGTHNGR DYALFFRQPV PKGVAEDAVL 
251 IEKARAKYAE LSYSKKGLQT FFLATLLIAS LLSIFLALVM ALYFARRFVE 
301 PVLSLAEGAK AVAQGDFSQT RPVLRNDEFG RLTKLFNHMT EQLSIAKEAD 
351 ERNRRREEAA RHYLECVLEG LTTGVVVFDE QGCLKTFNKA AEQILGMPLT 
401 PLWGSSRHGW HGVSAQQSLL AEVFAAIGAA AGTDKPVHVK YAAPDDAKIL 
451 LGKATVLPED NGNGVVMVID DITVLIHAQK EAAWGEVAKR LAHEIRNPLT 
501 PIQLSAERLA WKLGGKLDEQ DAQILTRSTD TIVKQVAALK EMVEAFRNYA 
551 RSPSLKLENQ DLNALIGDVL ALYEAGPCRF AAELAGEPLT VAADTTAMRQ 
601 VLHNIFKNAA EAAEEADVPE VRVKSETGQD GRIVLTVCDN GKGFGREMLH 
651 NAFEPYVTDK PAGTGLGLPV VKKIIEEHGG RISLSNQDAG GACVRIILPK 
701 TVKTYA* 
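
A quick consistency check on a complete open reading frame is that its nucleotide length is a whole number of codons and that the encoded protein is one residue shorter than the codon count, because the stop codon is not translated. The sketch below is illustrative only; the function name `expected_residues` is hypothetical.

```python
def expected_residues(nucleotide_length: int) -> int:
    """Residues encoded by a complete ORF, excluding the terminal stop codon."""
    if nucleotide_length % 3 != 0:
        raise ValueError("a complete ORF should be a whole number of codons")
    return nucleotide_length // 3 - 1


# ORF64-1: the nucleotide listing ends at position 2121 and the protein has 706 residues.
print(expected_residues(2121))  # 706

# ORF62-1: the nucleotide listing ends at position 855 and the protein has 284 residues.
print(expected_residues(855))  # 284
```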

Computer analysis of this amino acid sequence gave the following results: 
Homology with a predicted ORF from N. meningitidis (strain A) 

ORF64 (SEQ ID NO: 250) shows 92.6% identity over a 392aa overlap with an ORF (ORF64a) 
(SEQ ID NO: 254) from strain A of N. meningitidis: 



10 20 30 40 50 60 

orf 64 . pep MRRFLPIAAICAXXLXXGLTAATGSTSSLADYFWWIVAFSAMLLLVLSAVLARYVILLLK 

IIIIIIIIIIII I II Mil 1 1 1 II MM 1 1 Ml II II II I II II IIIIIIIIIIII 

' orf 64a MRRFLPIAAICAWLLYGLTAATGSTSSLADYFWWIVAFSAMLLLVLSAVLARYVILLLK 

10 20 30 40 50 60 



70 80 90 100 110 120 

orf 64 . pep DRRDGVFGSXXAKXPXX XMFTLVAXLPGVFLFG FPAQFINGTINSWFGNDTHEALERSLN 
. Ill llllll II 

orf64a DRRDGVFGSQ I AKR 

70 



MINI MM III i 1 1 1 1 1 1 II II I M II I M M Mi 

•LS GMFTLVAVLPGVFLFGV SAQFINGTINSWFGNDTHEALERSLN 
80 90 100 110 



130 140 150 160 170 180 

orf 64 . pep LSKSALNLAADNALGNAVPVQIDLIGAASLPGDMGRVLEHYAGSGFAQLALYNXASGKIE 

MMMMMIMM MUM MM 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 llllll 

orf 64a LSKSALNLAADNALGNAIPVQIDXIGAASLPXDMGRVLEHYAGSGFAQLALYNAASGKIE 
120 130 140 150 160 170 



190 200 210 220 230 240 

orf 64 . pep KSINPHKLDQPFPGKARWEKIQRAGSVRDLESIGGVLYAQGWLSAGTHXGRDYALFFRQP 

M M MM M M M I M M I MM M M I 1 1 1 1 1 1 1 E I Mill II 1 1 1 1 1 1 1 1 1 1 1 

orf 64a KSINPHKLDQPFPGKARWEKIQQAGSVRDXESIGGVLYAXGWLSAXTHNGRDYALFFRQP 
180 190 200 210 220 230 



250 260 270 280 290 300 

orf 64 . pep VPKGVAEDAVLIEKARAKYAELSYSKKGLQTFFLAT LLIASLLSIFLALVMALY FARRFV 

Mil 1 1 1 1 1 - 1 1 1 1 1 1 II 1 1 1 1 1 1 1 1 1 II 1 1 1 II M 1 1 M II 1 1 1 1 II II I 

or f 6 4 a VP KGVAEDAVL I E KARAXXXXLS YS KKGLQTFFLAT LLIASLLSIFLALVMALY FARRFV 

240 250 260 270 280 290 



310 320 330 340 350 360 

orf 64 . pep EPVLSLAEGAKAVAQGDFSQTRPVLRNDEFGRLTXLFNHMTEQLSIAKDADERNRRREEA 

MIIIIIIIMM MMMMMMIIMM MIIIMIIIIIIMIII llllll 

orf 64a EPVLSLAEGAKAVAQGDFSQTRPVLR^FDEFGRLTKLFNHMTEQLSIAKEADERNRRREEA 
300 310 320 330 340 350 



370 380 390 

orf 64 . pep ARHYLECVLEGLTTGVWFDEQGCLKTFNKAAGT 

I MMMMMMMMMMIMM MM 

orf 64a ARHYLECVLEGLTTGVWFDEQGCLKTFNKAAEQILGMPLTPLWGSSRHGWHGVSAQQSL 
360 370 380 390 400 410 

orf 64a LAE VF AA I GAAAGTDKP VHVKYAAPDD AK I LLGKATVL PEDNXNGWMV I DD I TVL I HAQ 

420 430 440 450 460 470 

The complete length ORF64a nucleotide sequence [<SEQ ID 253>] (SEQ ID NO: 253) is: 



1 ATGCGCCGTT TTCTACCGAT CGCAGCCATA TGCGCCGTCG TCCTGTTGTA 

51 CGGACTGACG GCGGCAACCG GCAGCACCAG TTCGCTGGCG GATTATTTCT 

101 GGTGGATTGT TGCGTTCAGC GCAATGCTGC TGCTGGTGTT GTCCGCCGTT 

151 TTGGCACGTT ATGTCATATT GCTGTTGAAA GACAGGCGCG ACGGCGTATT 

201 CGGTTCGCAG ATTGCCAAAC GCCTTTCCGG GATGTTTACG CTGGTTGCCG 

251 TACTGCCCGG CGTGTTTCTG TTCGGCGTTT CCGCACAGTT TATCAACGGC 

301 ACGATTAATT CGTGGTTCGG CAACGATACC CACGAGGCGC TTGAACGCAG 
351 CCTCAATTTG AGCAAGTCCG CATTGAATCT GGCGGCAGAC AACGCCCTTG 

401 GCAACGCCAT CCCCGTGCAG ATAGACNTCA TCGGCGCGGC TTCCCTGCCC 
451 NGGGATATGG GCAGGGTGCT GGAACATTAC GCCGGCAGCG GTTTTGCCCA 
501 GCTTGCCCTG TACAATGCCG CAAGCGGCAA AATCGAAAAA AGCATCAACC 
551 CGCACAAGCT CGATCAGCCG TTTCCAGGTA AGGCGCGTTG GGAAAAAATC 
601 CAACAGGCGG GTTCGGTCAG GGATNNGGAA AGCATAGGCG GCGTATTGTA 
651 CGCGCANGGC TGGCTGTCGG CAGNNACGCA CAACGGGCGC GATTACGCCT 
701 TGTTTTTCCG TCAGCCGGTT CCCAAAGGCG TGGCAGAGGA TGCCGTCTTA 
751 ATCGAAAAGG CAAGGGCGNA ANANNNTNAG TTGAGTTACA GCAAAAAAGG 
801 TTTGCAGACC TTTTTCCTNG CAACCCTGCT GATTGCCTCN CTGCTGTCGA 
851 TTTTTCTTGC ACTGGTCATG GCACTGTATT TCGCCCGCCG TTTCGTCGAA 
901 CCCGTCCTAT CGCTTGCCGA GGGGGCGAAG GCGGTGGCGC AAGGCGATTT 
951 CAGCCAGACG CGCCCCGTGT TGCGCAACGA CGAGTTCGGA CGCTTGACCA 
1001 AGTTGTTCAA CCACATGACC GAGCAGCTTT CCATCGCCAA AGAAGCAGAC 

1051 GAGCGCAACC GCCGGCGCGA GGAAGCCGCC AGACATTATC TCGAATGCGT 

1101 GTTGGAGGGG CTGACCACGG GCGTGGTGGT GTTTGACGAA CAAGGCTGTC 

1151 TGAAAACCTT CAACAAAGCG GCGGAACAGA TTTTGGGGAT GCCGCTTACC 

1201 CCCCTGTGGG GCAGCAGCCG GCACGGTTGG CACGGCGTTT CGGCGCAGCA 

1251 GTCCCTGCTT GCCGAAGTGT TTGCCGCCAT CGGCGCGGCG GCAGGTACGG 

1301 ACAAACCGGT CCATGTGAAA TATGCCGCGC CGGACGATGC CAAAATCCTG 

1351 CTGGGCAAGG CAACCGTCCT GCCCGAAGAC AACNGCAACG GCGTGGTAAT 

1401 GGTGATTGAC GACATCACCG TTTTGATACA CGCGCAAAAA GAAGCCGCGT 

1451 GGGGCGAAGT GGCAAAACGG CTGGCACACG AAATCCGCAA TCCGCTCACG 

1501 CCCATCCAGC TTTCTGCCGA ACGGCTGGCG TGGAAATTGG GCGGGAAGCT 

1551 GGACGAGCAN GACGCGCAAA TCCTGACACG TTCGACCGAC ACCATCATCA 

1601 AACAAGTGGC GGCATTAAAA GAAATGGTCG AGGCATTCCG CAATTACNCG 

1651 CGTTCCCCTT CGNCTCAATT GGAAAATCAG GATTTGAACG CCTTAATCGG 

1701 CGATGTGTTG GCATTGTACG AAGCTGGTCC GTGCCGGTTT GCGGCGGAAC 

1751 TTGCCGGCGA ACCGCTGATG ATGGCGGCGG ATACGACCGC CATGCGGCAG 

1801 GTGCTGCACA ATATTTTCAA AAATGCCGCC GAAGCGGCGG AAGAAGCCGA 

1851 TGTGCCCGAA GTCAGGGTAA AATCGGAAGC GGGGCAGGAC GGACGGATTG 

1901 TCCTGACAGT TTGCGACAAC GGCAAGGGGT TCGGCAGGGA AATGCTGCAC 

1951 AATGCCTTCG AGCCGTATGT AACGGACAAA CCGGCTGGAA CGGGATTGNG 

2001 ACTGCCCGTG GTGAAAAAAA TCATTGAAGA ACACGGCGGC CNCATCAGCC 

2051 TGAGCAATCA GGATGCGGGC GGCGCGTNTG TCAGAATCAT CTTGCCAAAA 

2101 ACGGTAGAAA CTTATGCGTA G 

This encodes a protein having amino acid sequence [<SEQ ID 254>] (SEQ ID NO: 254): 

1 MRRFLPIAAI CAVVLLYGLT AATGSTSSLA DYFWWIVAFS AMLLLVLSAV 

51 LARYVILLLK DRRDGVFGSQ IAKRLSGMFT LVAVLPGVFL FGVSAQFING 

101 TINSWFGNDT HEALERSLNL SKSALNLAAD NALGNAIPVQ IDXIGAASLP 

151 XDMGRVLEHY AGSGFAQLAL YNAASGKIEK SINPHKLDQP FPGKARWEKI 

201 QQAGSVRDXE SIGGVLYAXG WLSAXTHNGR DYALFFRQPV PKGVAEDAVL 

251 IEKARAXXXX LSYSKKGLQT FFLATLLIAS LLSIFLALVM ALYFARRFVE 

301 PVLSLAEGAK AVAQGDFSQT RPVLRNDEFG RLTKLFNHMT EQLSIAKEAD 

351 ERNRRREEAA RHYLECVLEG LTTGVVVFDE QGCLKTFNKA AEQILGMPLT 

401 PLWGSSRHGW HGVSAQQSLL AEVFAAIGAA AGTDKPVHVK YAAPDDAKIL 

451 LGKATVLPED NXNGVVMVID DITVLIHAQK EAAWGEVAKR LAHEIRNPLT 

501 PIQLSAERLA WKLGGKLDEX DAQILTRSTD TIIKQVAALK EMVEAFRNYX 

551 RSPSXQLENQ DLNALIGDVL ALYEAGPCRF AAELAGEPLM MAADTTAMRQ 

601 VLHNIFKNAA EAAEEADVPE VRVKSEAGQD GRIVLTVCDN GKGFGREMLH 

651 NAFEPYVTDK PAGTGLXLPV VKKIIEEHGG XISLSNQDAG GAXVRIILPK 

701 TVETYA* 

ORF64a (SEQ ID NO: 254) and ORF64-1 (SEQ ID NO: 252) show 96.6% identity in 706 aa 
overlap: 

10 20 30 40 50 60 

orf 64a . pep MRRFLPIAAICAWLLYGLTAATGSTSSLADYFWWIVAFSAMLLLVLSAVLARYVILLLK 

M 1 1 1 M 1 1 1 1 M 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 M I i 1 1 1 1 1 1 i 1 1 1 1 1 1 1 1 . 1 1 i 1 1 1 1 

orf 64- 1 MRRFLPIAAICAWLLYGLTAATGSTSSLADYFWWIVAFSAMLLLVLSAVLARYVILLLK 

10 20 30 40 50 60 

70 80 ' 90 100 110 120 

orf 64a. pep DRRDGVFGSQ I AKRLSGMFTLVAVLPGVFLFGVSAQF I NGTINSWFGNDTHEALERSLNL 

MM IIIMIII III IIMIIII IMIIII IIIIMIl IIIIMI IMMII IMIMI! 

orf 64 - 1 DRRDGVFGSQ I AKRLSGMFTLVAVLPGVFLFGVSAQFINGTINSWFGNDTHEALERSLNL 

70 80 90 100 110 120 
130 140 150 160 170 180 

orf 64a . pep S KS ALNLAADNALGNAI PVQ I DX I GAASLPXDMGRVLEHYAGSGFAQLALYNAASGKI EK 

I I I II I I I I I I I I I I I : I I I I I I I I I I I I I I I I I I I I II I I I I I I I I I I I I I. I I I I I I 
orf 64 - 1 S KS ALNLAADNALGNAVPVQ I DL IGAASLPGDMGRVLEHYAGSGFAQLALYNAASGKI EK 

130 140 150. 160 170 180 






190 200 210 220 230 240 

orf 64a. pep S INPHKLDQPFPGKARWEKIQQAGSVRDXES IGGVLYAXGWLSAXTHNGRDYALFFRQPV 

II II I I I II I II I I I I I I I Ml I I I M IMIIIIII Mill llllll lllllll 
orf 64 - 1 S INPHKLDQPFPGKARWEKIQRAGSVRDLES IGGVLYAQGWLSAGTHNGRDYALFFRQPV 

190 200 210 220 230 240 






250 260 270 280 290 300 

orf 64a . pep PKGVAEDAVLIEKARAXXXXLSYSKKGLQTFFLATLLIASLLSIFLALVMALYFARRFVE 

IIIIMIIIIIIIIII 1 1 M 1 1 M 1 1 1 1 1 II 1 1 1 1 1 II (1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 

orf 64 - 1 PKGVAEDAVLIEKARAKYAELSYSKKGLQTFFLATLLIASLLSIFLALVMALYFARRFVE 
250 260 270 280 290 300 






310 320 330 340 350 360 

orf 64a. pep PVLSLAEGAKAVAQGDFSQTRPVLRNDEFGRLTKLFNHMTEQLSIAKEADERNRRREEAA 

INI IIIMMMIIIIIIIIIII IIIIIIIIIIIIIIIIMIIIIIIM lllllll 

orf 64-1 PVLSLAEGAKAVAQGDFSQTRPVLRNDEFGRLTKLFNHMTEQLS IAKEADERNRRREEAA 

310 320 330 340 350 360 






370 380 390 400 410 420 

orf 64a . pep RHYLECVLEGLTTGVWFDEQGCLKTFNKAAEQILGMPLTPLWGSSRHGWHGVSAQQSLL 

MM llllllllllllllllllll IIIIIIIIIIIIIIIIMIIIIIIM lllllll 

orf 64 - 1 RHYLECVLEGLTTGVWFDEQGCLKTFNKAAEQILGMPLTPLWGSSRHGWHGVSAQQSLL 
370 380 390 400 410 420 






430 440 450 460 470 480 

orf 64a . pep AEVFAA I GAAAGTDKPVHVKYAAPDDAKI LLGKATVLPEDNXNGWMVI DD I TVL I HAQK 

1 1 1 M 1 1 II 1 1 II 1 1 1 II 1 1 1 1 1 1 1 M 1 1 1 1 1 1 1 1 1 1 1 1 lllllll IMIIMI 

orf 64 - 1 AEVFAA I GAAAGTDKPVHVKYAAPDDAKI LLGKATVLPEDNGNGWMVI DD I TVL I HAQK 

430 440 450 460 470 480 






490 500 510 520 530 540 

orf 64a. pep EAAWGEVAKRLAHE I RNPLTP I QLS AERLAWKLGGKLDEXDAQ I LTRSTDT 1 1 KQVAALK 

II IMIIIIII II llllll II II III llllll lllllll II MM II II I MM 1 1 II I 

orf 64 - 1 EAAWGEVAKRLAHE I RNPLTP I QLS AERLAWKLGGKLDEQDAQ I LTRSTDT I VKQVAALK 

490 500 510 520 530 540 






550 560 570 580 590 600 

orf 64a . pep EMVEAFRNYXRSPSXQLENQDLNALIGDVLALYEAGPCRFAAELAGEPLMMAADTTAMRQ 

IMMIMI MM M 1 1 1 1 1 1 1 1 M 1 1 1 II 1 1 1 1 1 1 1 1 1 II 1 1 1 1 1 MMMIMI 

orf 64 - 1 EMVEAFRNYARS PS LKLENQDLNAL I GDVLALYEAGPCRFAAELAGEPLTVAADTTAMRQ 

550 560 570 580 590 600 






610 620 630 640 650 660 

orf 64a . pep VLHN I FKNAAE AAEEADVPEVRVKS EAGQDGR I VLTVCDNGKGFGREMLHNAFE P YVTDK 

1 1 1 1 1 1 II 1 1 M 1 1 1 1 1 1 1 1 1 1 1 II M M I II 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 II 1 1 1 1 II 1 1 1 1 1 

orf 64 - 1 VLHN I FKNAAEAAEEADVPEVRVKSETGQDGR I VLTVCDNGKGFGREMLHNAFE P YVTDK 

610 620 630 640 650 660 






670 680 690 700 

orf 64a . pep PAGTGLXLPWKKI I EEHGGX ISLSNQDAGGAXVRI ILPKTVETYAX 

MUM I E i 1 1 1 1 1 1 1 1 1 1 Ml Mill IMIMMMIIII 

orf 64 - 1 PAGTGLGLPWKKI IEEHGGRISLSNQDAGGACVRI I LPKTVKT YAX 

670 680 690 700 

Homology with a predicted ORF from N. gonorrhoeae 

ORF64 (SEQ ID NO: 250) shows 86.6% identity over a 387aa overlap with a predicted ORF 
(ORF64.ng) (SEQ ID NO: 256) from N. gonorrhoeae: 



orf 64.pep MRRFLPIAAICAXXLXXGLTAATGSTSSLADYFWWIVAFSAMLLLVLSAVLARYVILLLK 60 
IIIIIIIIIIM I I M I i I M I I II I I II I I I h I I I I I I I I I I I I I i I I I I I I I I 
orf 64ng MRRFLPIAAICAVVLLYGLTAATGSTSSLADYFWWIVSFSAMLLLVLSAVLARYVILLLK 60 

orf 64.pep DRRDGVFGSXXAKXPXXXMFTLVAXLPGVFLFGFPAQFINGTINSWFGNDTHEALERSLN 120 
Ilhlllll II llllll Ilhlllh I I I I i I I I I I I I I I M I I I I I I I I I 
orf 64ng DRRNGVFGSQIAKR-LSGMFTLVAVLPGLFLFGISAQFINGTINSWFGNDTHEALERSLN 119 

orf 64.pep LSKSALNLAADNALGNAVPVQIDLIGAASLPGDMGRVLEHYAGSGFAQLALYNXASGKIE 180 
Mill Ml lll|::|l III IIIIIMII hll I I I I I I I I I I I I I II I llllll 
orf 64ng LSKSALDLAADNAVSNAVPVQIDLIGTASLSGNMGSVLEHYAGSGFAQLALYNAASGKIE 179 

orf 64.pep KSINPHKLDQPFPGKARWEKIQRAGSVRDLESIGGVLYAQGWLSAGTHXGRDYALFFRQP 240 
IIIM-IIM I ^ I h 1 1 -1 1 1 h 1 1 1 1 1 1 1 1 1 1 1 1 1 M 1 1 1 1 lllllllllll 
orf 64ng KSINPHQFDQPLPDKEHWEQIQQTGSVRSLESIGGVLYAQGWLSAGTHNGRDYALFFRQP 239 

orf 64.pep VPKGVAEDAVLIEKARAKYAELSYSKKGLQTFFLATLLIASLLSIFLALVMALYFARRFV 300 
: I -I h I I I I I I I I I I I I I I I I I I I I I I I I I I h I I I I I I I I I I I I I I I I I II I I I I I 
orf 64ng IPENVAQDAVLIEKARAKYAELSYSKKGLQTFFLVTLLIASLLSIFLALVMALYFARRFV 299 

orf 64.pep EPVLSLAEGAKAVAQGDFSQTRPVLRNDEFGRLTXLFNHMTEQLSIAKDADERNRRREEA 360 
I I : I 1 t I I I I I I I I i I I I I I I I I I I 1 I I I I I I I t I I I I I I 1 1 I I I 1 I = I I 1 I I I I t I ! I 
orf 64ng EPILSLAEGAKAVAQGDFSQTRPVLRNDEFGRLTKLFNHMTEQLSIAKEADERNRRREEA 359 

orf 64.pep ARHYLECVLEGLTTGVVVFDEQGCLKTFNKAAGT 394 
llllll I I I : I I I I I 1 I I :| I 
orf 64ng ARHYLECVLDGLTTGVVVSYPLSCCRTAVFSTCHSSPLSYF 400 

An ORF64ng nucleotide sequence [<SEQ ID 255>] (SEQ ID NO: 255) was predicted to encode a 
protein having amino acid sequence [<SEQ ID 256>] (SEQ ID NO: 256): 

1 MRRFLPIAAI CAVVLLYGLT AATGSTSSLA DYFWWIVSFS AMLLLVLSAV 

51 LARYVILLLK DRRNGVFGSQ IAKRLSGMFT LVAVLPGLFL FGISAQFING 

101 TINSWFGNDT HEALERSLNL SKSALDLAAD NAVSNAVPVQ IDLIGTASLS 

151 GNMGSVLEHY AGSGFAQLAL YNAASGKIEK SINPHQFDQP LPDKEHWEQI 

201 QQTGSVRSLE SIGGVLYAQG WLSAGTHNGR DYALFFRQPI PENVAQDAVL 

251 IEKARAKYAE LSYSKKGLQT FFLVTLLIAS LLSIFLALVM ALYFARRFVE 

301 PILSLAEGAK AVAQGDFSQT RPVLRNDEFG RLTKLFNHMT EQLSIAKEAD 

351 ERNRRREEAA RHYLECVLDG LTTGVVVSYP LSCCRTAVFS TCHSSPLSYF* 



Further work revealed the complete gonococcal DNA sequence [<SEQ ID 257>] (SEQ ID NO: 
257): 



1 ATGCGCCGCT TCCTACCGAT CGCAGCCATA TGCGCCGTCG TCCTGCTGTA 

51 CGGATTGACG GCGGCGACCG GCAGCACCAG TTCGCTGGCG GATTATTTCT 

101 GGTGGATAGT CTCGTTCAGC GCAATGCTGC TGCTGGTGTT GTCCGCCGTT 

151 TTGGCACGTT ATGTCATATT GCTGTTGAAA GACAGGCGCA ACGGCGTGTT 
201 CGGTTCGCAG ATTGCCAAAC GCCTTTCCGG GATGTTCACG CTGGTCGCCG 

251 TACTGCCCGG CTTGTTCCTG TTCGGCATTT CCGCGCAGTT TATCAACGGC 

301 ACGATTAATT CGTGGTTCGG CAACGACACC CACGAAGCCC TCGAACGCAG 

351 CCTTAATTTG AGCAAGTCCG CACTGGATTT GGCGGCAGAC AATGCCGTCA 

401 GCAACGCCGT TCCCGTACAG ATAGACCTCA TCGGCACCGC CTCCCTGTCG 

451 GGCAATATGG GCAGTGTGCT GGAACACTAC GCCGGCAGCG GTTTTGCCCA 

501 GCTTGCCCTG TACAATGCCG CAAGCGGGAA AATCGAAAAA AGCATCAATC 

551 CGCACCAATT CGACCAGCCG CTTCCCGACA AAGAACATTG GGAACAGATT 

601 CAGCAGACCG GTTCGGTTCG GAGTTTGGAA AGCATAGGCG GCGTATTGTA 

651 CGCGCAGGGA TGGTTGTCGG CAGGTACGCA CAACGGGCGC GATTACGCGC 

701 TGTTCTTCCG CCAGCCGATT CCCGAAAATG TGGCACAGGA TGCCGTTCTG 

751 ATTGAAAAGG CGCGGGCGAA ATATGCCGAA TTGAGTTACA GCAAAAAAGG 

801 TTTGCAGACC TTTTTTCTGG TAACCCTGCT GATTGCCTCG CTGCTGTCGA 

851 TTTTTCTTGC GCTGGTAATG GCACTGTATT TTGCCCGCCG TTTCGTCGAA 

901 CCCATTCTGT CGCTTGCCGA GGGCGCAAAG GCGGTGGCGC AGGGTGATTT

951 CAGCCAGACG CGCCCCGTAT TGCGCAACGA CGAGTTCGGA CGTTTGACCA 

1001 AGCTGTTCAA CCATATGACC GAGCAGCTTT CCATCGCCAA AGAAGCAGAC 

1051 GAACGCAACC GCCGGCGCGA GGAAGCCGCC CGTCACTACC TCGAGTGCGT 

1101 GTTGGATGGG TTGACTACCG GTGTGGTGGT GTTTGACGAA AAAGGCCGTT 

1151 TGAAAACCTT CAACAAGGCG GCGGAACAGA TTTTGGGGAT GCCGCTCGCC

1201 CCCCTGTGGG GCAGCAGCCG GCACGGTTGG CACGGCGTTT CGGCGCAGCA 

1251 GTCCCTGCTT GCCGAAGTGT TtgccgccAT CGGTGCGGCG GCAGGTACGG 

1301 ACAAACCGGT CCAGGTGGAA TATGCCGCGC CGGACGATGC CAAAATCCTG 

1351 CTGGGCAAGG CGACGGTATT GCCCGAAGAC AACGGCAACG GCGTGGTGAT 

1401 GGTGATTGAC GACATCACCG TGCTGATACG CGCGCAAAAA GAAGCCGCGT

1451 GGGGTGAAGT GGCGAAGCGG CTGGCACACG AAATCCGCAA TCCGCTCACG 

1501 CCCATCCAGC TTTCCGCCGA ACGGCTGGCG TGGAAATTGG GCGGGAAGCT 

1551 GGACGATCAG GACGCGCAAA TCCTGACGCG TtcgACCGAC ACCATCATCA 

1601 AACAGgtggc gGCGTTAAAA GAAATGGTCG AGGCATTCCG CAATTACGCG 

1651 CGCGCCCCTT CGCTCAAACT GGAAAATCAG GATTTGAACG CCTTAATCGG

1701 CGATGTTTTG GCCCTGTACG AAGCCGGCCC GTGCCGGTTT GAGGCGGAAC 

1751 TTGCCGGCGA ACCGCTGATG ATGGCGGCGG ATACGACCGC CATGCGGCAG 

1801 GTGCTGCACA ATATTTTCAA AAATGCCGCC GAAGCGGCGG AAGAAGCCGA 

1851 TATGCCCGAA GTCAGGGTAA AATCGGAAAC GGGGCAGGAC GGACGGATTG 

1901 TCCTGACGGT TTGCGACAAC GGCAAGGGAT TCGGCAAGGA AATGCTGCAC

1951 AATGCTTTCG AGCCGTATGT GACGGATAAG CCGGCGGGAA CGGGACTGGG 

2001 TCTGCCTGTA GTGAAAAAAA TCATTGGAGA ACACGGCGGC CGCATCAGCC 

2051 TGAGCAATCA GGATGCGGGT GGGGCGTGTG TCAGAATCAT CTTGCCAAAA

2101 ACGGTAGAAA CTTATGCGTA G 



This corresponds to the amino acid sequence [<SEQ ID 258; ORF64ng-1>] (SEQ ID NO: 258;
ORF64ng-1):



1 MRRFLPIAAI CAVVLLYGLT AATGSTSSLA DYFWWIVSFS AMLLLVLSAV
51 LARYVILLLK DRRNGVFGSQ IAKRLSGMFT LVAVLPGLFL FGISAQFING
101 TINSWFGNDT HEALERSLNL SKSALDLAAD NAVSNAVPVQ IDLIGTASLS
151 GNMGSVLEHY AGSGFAQLAL YNAASGKIEK SINPHQFDQP LPDKEHWEQI
201 QQTGSVRSLE SIGGVLYAQG WLSAGTHNGR DYALFFRQPI PENVAQDAVL
251 IEKARAKYAE LSYSKKGLQT FFLVTLLIAS LLSIFLALVM ALYFARRFVE
301 PILSLAEGAK AVAQGDFSQT RPVLRNDEFG RLTKLFNHMT EQLSIAKEAD
351 ERNRRREEAA RHYLECVLDG LTTGVVVFDE KGRLKTFNKA AEQILGMPLA
401 PLWGSSRHGW HGVSAQQSLL AEVFAAIGAA AGTDKPVQVE YAAPDDAKIL
451 LGKATVLPED NGNGVVMVID DITVLIRAQK EAAWGEVAKR LAHEIRNPLT
501 PIQLSAERLA WKLGGKLDDQ DAQILTRSTD TIIKQVAALK EMVEAFRNYA
551 RAPSLKLENQ DLNALIGDVL ALYEAGPCRF EAELAGEPLM MAADTTAMRQ
601 VLHNIFKNAA EAAEEADMPE VRVKSETGQD GRIVLTVCDN GKGFGKEMLH
651 NAFEPYVTDK PAGTGLGLPV VKKIIGEHGG RISLSNQDAG GACVRIILPK
701 TVETYA*
ORF64ng-1 (SEQ ID NO: 258) and ORF64-1 (SEQ ID NO: 252) show 93.8% identity in 706 aa
overlap:



orf 64ng-l .pep 



orf 64-1 



10 20 30 40 50 60 

MRRFLPIAAICAWLLYGLTAATGSTSSLADYFWWIVSFSAMLLLVLSAVLARYVILLLK 

1 1 M I M 1 1 1 1 1 1 1 1 1 II II 1 1 II 1 1 1 III I II I M I M M I M I Ml M 1 1 1 M I II 

MRRFLPIAAICAWLLYGLTAATGSTSSLADYFWWIVAFSAMLLLVLSAVLARYVILLLK 
10 20 30 40 50 60 



10 



orf 64ng-l .pep 



orf 64-1 



70 80 90 100 110 120 

DRRNGVFGSQIAKRLSGMFTLVAVLPGLFLFGISAQFINGTINSWFGNDTHEALERSLNL 

I I I : M II I I I I 1 1 ! I I I I I I II I I I I : M 1 I = M I I I 1 I I I I I I I I I I M I I I 

DRRDGVFGSQIAKRLSGMFTLVAVLPGVFLFGVSAQFINGTINSWFGNDTHEALERSLNL 
70 80 90 100 110 120 



15 



orf 64ng-l . pep 



orf 64-1 



130 140 150 160 170 180 

SKSALDLAADNAVSNAVPVQIDLIGTASLSGNMGSVLEHYAGSGFAQLALYNAASGKIEK 

I I I I :| I I I Ih : I I I I I I I M I M I Ml | | | M II I I I I I I I I I I I I I I I I I I 
S KS ALNLAADNALGNA VP VQ I DL I G AAS L PGDMGR VLEH Y AGS GF AQLAL YNAASGK I E K 
130 140 150 160 170 180 



20 



orf 64ng-l .pep 



orf64-l 



190 200 210 220 230 240 

S INPHQFDQPLPDKEHWEQ IQQTGS VRSLES I GGVL YAQGWLSAGTHNGRD YALFFRQP I 

MIIMMIM I MMMMIMIMMIMI MUM MIIIMMIIMM 
SINPHKLDQPFPGKARWEKIQRAGSVRDLESIGGVLYAQGWLSAGTHNGRDYALFFRQPV 
190 200 210 220 230 240 



25 



orf 64ng-l . pep 



orf 64-1 



250 260 270 280 290 300 

PENVAQDAVLIEKARAKYAELSYSKKGLQTFFLVTLLIASLLSIFLALVMALYFARRFVE 

M Ml I I I I I I I II I I I II I M I I I II I I Ml M I I II II I I I I II I I II II I M 
PKGVAEDAVLIEKARAKYAELSYSKKGLQTFFLATLLIASLLSIFLALVMALYFARRFVE 
250 260 270 280 290 300 



30 



orf 64ng-l . pep 



orf 64-1 



310 320 330 340 350 360 

P I LSLAEGAKAVAQGDFSQTRPVLRNDEFGRLTKLFNHMTEQLS I AKEADERNRRREEAA 

Mlllllllllll lllllll IMIIMIII Mil IIIIMilMIIIIIII llllllll 

PVLSLAEGAKAVAQGDFSQTRPVLRNDEFGRLTKLFNHMTEQLS I AKEADERNRRREEAA 
310 320 330 340 350 360 



35 



orf 64ng-l .pep 



orf 64-1 



370 380 390 400 410 420 

RHYLECVLDGLTTGVWFDEKGRLKTFNKAAEQILGMPLAPLWGSSRHGWHGVSAQQSLL 

IIMIIMIIIIII MM I II I M I II II 1 1 M M I II 1 1 II 1 1 1 1 M 1 1 

RHYLECVLEGLTTGVWFDEQGCLKTFNKAAEQILGMPLTPLWGSSRHGWHGVSAQQSLL 
370 380 390 400 410 420 



40 



orf 64ng-l .pep 



orf64-l 



430 440 450 460 470 480 

AEVFAAIGAAAGTDKPVQVEYAAPDDAKILLGKATVLPEDNGNGWMVIDDITVLIRAQK 

M M M I II M 1 1 1 II MMI II 1 1 1 1 1 M M 1 1 M I III 1 1 1 1 1 1 1 II M II I II M 1 1 

AEVFAAIGAAAGTDKPVHVKYAAPDDAKILLGKATVLPEDNGNGWMVIDDITVLIHAQK 
430 440 450 460 470 480 



45 



orf 64ng-l .pep 



orf 64-1 



490 500 510 520 530 540 

EAAWGEVAKRLAHE I RNPLTP IQLS AERLAWKLGGKLDDQDAQ I LTRSTDT I I KQVAALK 

II I II II 1 1 1 1 1 1 1 II 1 1 II I M 1 1 1 1 1 1 1 1 1 1 M M II 1 1 1 1 1 1 M Ml II 1 1 1 

EAAWGEVAKRLAHE I RNPLTP I QLS AERLAWKLGGKLDEQDAQ I LTRSTDT I VKQVAALK 
490 500 510 520 530 540 
550 560 570 580 590 600

orf 64ng- 1 . pep EIWEAFRNYARAPSLKLENQDLNALIGDVLALYEAGPCRFEAELAGEPLMMAADTTAMRQ 

ii ii i ii nihil ii i ii ii Mill 1 1 ii i ii ii ii i ii i ii i ii ii ^ ii hum 

orf 64 - 1 EMVEAFRNYARSPSLKLENQDLNALIGDVLALYEAGPCRFAAELAGEPLTVAADTTAMRQ 
5 550 560 570 580 590 600 

610 620 630 640 650 660 

orf 64ng- 1 . pep VLHNIFKNAAEAAEEADMPEVRVKSETGQDGRIVLTVCDNGKGFGKEMLHNAFEPYVTDK 
I I I I I I I II II I I I I I I : I I I I I I I I I I I I I I I I I I I I II I I I I : I I I I I I I I I I ! I I 
or f 64 - 1 VLHNI FKNAAEAAEEADVPEVRVKSETGQDGRI VLTVCDNGKGFGREMLHNAFEPYVTDK 

10 610 620 630 640 650 660 

670 680 690 700 

orf 64ng-l .pep PAGTGLGLPWKKI IGEHGGR I SLSNQDAGGACVRI I LPKTVETYAX 

1 1 1 1 1 1 1 1 1 1 1 1 1 1 lllllll llllll llllll IIIM I 

orf 64-1 PAGTGLGLPWKKI I E EHGGR I SLSNQDAGGACVRI I LPKTVKTY AX 

15 670 680 690 700 

Furthermore, ORF64ng-1 (SEQ ID NO: 258) shows significant homology to a protein (SEQ ID
NO: 1129) from A. caulinodans:

sp|Q04850|NTRY_AZOCA NITROGEN REGULATION PROTEIN NTRY >gi|77479|pir||S18624 ntrY
protein - Azorhizobium caulinodans >gi|38737 (X63841) NtrY gene product
[Azorhizobium caulinodans] Length = 771
Score = 218 bits (550), Expect = 7e-56
Identities = 195/720 (27%), Positives = 320/720 (44%), Gaps = 58/720 (8%)

Query: 7 IAAICAWLLYGLTAATGSTSSLADYFWWIXXXXXXXXXXXXXXXXRYVILLLKDRRNGV 66 
I+A+ ++L GLT + + + R++KRG

Sbjct: 35 ISALATFLILMGLTPWPTHQWIS VLLVNAAAVL I LSAMVGRE I WRI AKARARGR 90 

Query: 67 FGSQIAKRLSGMFTLVAVLPGLFLFGISAQFINGTINSWFGNDTHEALERSLNLSKSALD 126 

+++ R+ G+F +V+V+P + + +++ ++ ++ WF T E + S++++++ + 
Sbjct: 91 AAARLHIRIVGLFAWSWPAILVAWASLTLDRGLDRWFSMRTQEIVASSVSVAQTYVR 150 

Query: 127 LAADNAVSNAVPVQIDLIGTASLSGNMGSVLEHYAG--SGFAQLALYNAASGKIEKSINP 184

AN+ + +DL S+ YGSFQ+ AA++ + 

Sbjct: 151 EHALN I RGD I LAMS ADLTRLKS V YEGDRSRFNQILTAQAALRNLPGAMLI 200 

Query: 185 HQFDQPLPDKEHWEQIQQTGSVRSLESIGGVLYAQGWLSAGTHNGRDYA 233 

+ D++++ 1+ V + +IG Q + N DY 

Sbjct: 201 RR-DLSWERAN- VNIGREFI VPANLAIGDATPDQPVTYLP--NDADYVAAWPLKDYDD 256

Query: 234 - - LFFRQP I PENVAQDAVLI EKARAKYAELS YSKKGLQTFFLVTXXXXXXXXXXXXXVMA 291 

L++I V ++AYL + G+Q F + + 

Sbjct: 257 LYLYVARL I DPRV I GYLKTTQETLAD YRSLEERRFGVQVAFALMYAVI TL I VLLSAVWLG 316 

Query: 292 LYFARRFVEPILSLAEGAKAVAQGDFSQTRPVLRND- EFGRLTKLFNHMTEQLS IXXXXX 350 
L F++ V PI L A VA+G+ P+ R + + L + FN MT +L

Sbjct: 317 LNFSKWLVAPIRRLMSAADHVAEGNLDVRVPIYRAEGDLASLAETFNKMTHELRSQREAI 376 

Query: 351 XXXXXXXXXXXHYLECVLDGLTTGVWFDEKGRLKTFNKAAEQILGMPLAPLWGSSRHGW 410 

+ E VL G+ GV+ D + R+ N++AE++LG L+ + RH 
Sbjct: 377 LTARDQIDSRRRFTEAVLSGVGAGVIGLDSQERITILNRSAERLLG- - LSEVEALHRHLA 434 



Query: 411 HGVSAQQSLLAEVFXXXXXXXXTDKPVQVEYAAPDDAKILLGKATVLPEDNG NGWM 467

V LL E + VQ D + + V E + +G V+

Sbjct: 435 EWPETAGLLEEA EHARQRSVQGNITLTRDGRERVFAVRVTTEQSPEAEHGWW 488

Query: 468 VIDD I TVL I RAQKEAAWGEVAKRLAHE I RNPLTP I QLS AERLAWKLGGKLDDQDAQ I LTR 527 

+DDIT LI AQ+ +AW +VA+R+AHEI+NPLTPIQLSAERL KG + QD +1 + 

Sbjct: 489 TLDD I TEL I S AQRTS AWADVARR I AHE I KNPLTP IQLS AERLKRKFGRHV - TQDRE I FDQ 547 

Query: 528 STDTIIKQVAALKEMVEAFRNYARAPSLKLENQDLNALIGDVLALYEAGPCRFEAELAGE 587

TDTII+QV + MV+ F ++AR P +++QD++ +1 + L G + 

Sbjct: 548 CTDT 1 1 RQVGD I GRMVDE FS S FARMPKP WDSQDMS E 1 1 RQTVFLMRVGHPEWFDSEVP 607 

Query: 588 PLMMAA- DTTAMRQVLHNI FKNXXXXXXXXDMPEVRVK S ETGQDGR I VLTVCD 639 

PMA D +QLNIKN P+VR + + G+D +V+ + D 

Sbjct: 608 PAMPARFDRRLVSQALTNILKNAAEAIEAVP- PDVRGQGRIRVSANRVGED--LVIDIID 664

Query: 640 NGKGFGKEMLHNAFEPYVTDKPAGTGLGLPWKKI IGEHGGRISLSNQDAG-GACVRI IL 698 

NG G +E + EPYVT + GTGLGL +V KI+ EHGG I L++ G GA +R+ L 

Sbjct: 665 NGTGLPQESRNRLLEPYVTTREKGTGLGLAIVGKIMEEHGGGIELNDAPEGRGAWIRLTL 724 



Based on this analysis, including the presence of a putative leader sequence (double-underlined)
and several putative transmembrane domains (single-underlined) in the gonococcal protein, it is
predicted that the proteins from N. meningitidis and N. gonorrhoeae, and their epitopes, could be
useful antigens for vaccines or diagnostics, or for raising antibodies.



Example 31 



The following partial DNA sequence was identified in N. meningitidis [<SEQ ID 259>] (SEQ ID
NO: 259):



1 ATGTACGCAT TTACCGCCGC ACAGCAACAG AAGGCACTCT TCCGGCTGGT 

51 GCTTTTTCAT ATCCTCATCA TCGCCGCCAG CAACTATCTG GTGCAGTTCC 

101 CTTTCCAAAT TTTCGGCATC CACACCACTT GGGGCGCATT TTCCTTTCCC 

151 TTCATCTTCC TTGCCACCGA CCTGACCGTC CGCATTTTCG GTTCTCACTT 

201 GGCACGGCGG ATTATCTTTT GGGTGATGTT CCCCGCCCTT TTGCTTTCCT

251 ACGTCTTTTC CGTTTTGTTC CACAACGGCA GTTGGACAGG CTTGGGCGCG 

301 CTGTCCGAAT TCAACACCTT TGTCGGACGC ATCGCCTTAG CCAGCTTTGC 

351 CGCCTACGCG ATCGGACAAA TCCTTGATAT TTTTGTATTC AACAAATTAC 

401 GCCGTCTGAA AGCGTGGTGG ATTGCACCGA ACGCATCAAC CGTCATCGGG

451 CACGCGTTGG ATACG . . .



This corresponds to the amino acid sequence [<SEQ ID 260; ORF66>] (SEQ ID NO: 260;
ORF66):



1 MYAFTAAQQQ KALFRLVLFH ILIIAASNYL VQFPFQIFGI HTTWGAFSFP
51 FIFLATDLTV RIFGSHLARR IIFWVMFPAL LLSYVFSVLF HNGSWTGLGA
101 LSEFNTFVGR IALASFAAYA IGQILDIFVF NKLRRLKAWW IAPNASTVIG
151 HALDT . . .

Further work revealed the complete nucleotide sequence [<SEQ ID 261>] (SEQ ID NO: 261):

1 ATGTACGCAT TTACCGCCGC ACAGCAACAG AAGGCACTCT TCCGGCTGGT

51 GCTTTTTCAT ATCCTCATCA TCGCCGCCAG CAACTATCTG GTGCAGTTCC 

101 CTTTCCAAAT TTTCGGCATC CACACCACTT GGGGCGCATT TTCCTTTCCC 

151 TTCATCTTCC TTGCCACCGA CCTGACCGTC CGCATTTTCG GTTCTCACTT 

201 GGCACGGCGG ATTATCTTTT GGGTGATGTT CCCCGCCCTT TTGCTTTCCT 

251 ACGTCTTTTC CGTTTTGTTC CACAACGGCA GTTGGACAGG CTTGGGCGCG 

301 CTGTCCGAAT TCAACACCTT TGTCGGACGC ATCGCCTTAG CCAGCTTTGC

351 CGCCTACGCG ATCGGACAAA TCCTTGATAT TTTTGTATTC AACAAATTAC

401 GCCGTCTGAA AGCGTGGTGG ATTGCACCGA CCGCATCAAC CGTCATCGGC
451 AACGCCTTGG ATACGCTGGT ATTTTTCGCC GTTGCCTTCT ACGCAAGCAG
501 CGATGGATTT ATGGCGGCAA ACTGGCAGGG CATCGCTTTT GTCGATTACC 
551 TGTTCAAACT TACCGTCTGC ACCCTCTTCT TCCTGCCCGC CTACGGCGTG 
601 ATACTGAATC TGCTGACGAA AAAACTGACA ACCCTGCAAA CCAAACAGGC 
651 GCAAGACCGC CCCGCGCCCT CGCTGCAAAA TCCGTAA 

This corresponds to the amino acid sequence [<SEQ ID 262; ORF66-1>] (SEQ ID NO: 262;
ORF66-1):



1 MYAFTAAQQQ KALFRLVLFH ILIIAASNYL VQFPFQIFGI HTTWGAFSFP
51 FIFLATDLTV RIFGSHLARR IIFWVMFPAL LLSYVFSVLF HNGSWTGLGA
101 LSEFNTFVGR IALASFAAYA IGQILDIFVF NKLRRLKAWW IAPTASTVIG
151 NALDTLVFFA VAFYASSDGF MAANWQGIAF VDYLFKLTVC TLFFLPAYGV
201 ILNLLTKKLT TLQTKQAQDR PAPSLQNP*



Computer analysis of this amino acid sequence gave the following results: 

Homology with the hypothetical protein o221 (SEQ ID NO: 1130) of E. coli (accession number
P37619)



ORF66 (SEQ ID NO: 260) and o221 protein (SEQ ID NO: 1130) show 67% aa identity in 155 aa
overlap:



orf66 1    MYAFTAAQQQKALFRLVLFHILIIAASNYLVQFPFQIFGIHTTWGAFSFPFIFLATDLTV 60
           M  F+  Q+ KALF L LFH+L+I +SNYLVQ P  I G HTTWGAFSFPFIFLATDLTV
o221  1    MNVFSQTQRYKALFWLSLFHLLVITSSNYLVQLPVSILGFHTTWGAFSFPFIFLATDLTV 60

orf66 61   RIFGSHLARRIIFWVMFPALLLSYVFSVLFHNGSWTGLGALSEFNTFVGRIALASFAAYA 120
           RIFG+ LARRIIF VM PALL+SYV S LF+ GSW G GAL+ FN FV RIA ASF AYA
o221  61   RIFGAPLARRIIFAVMIPALLISYVISSLFYMGSWQGFGALAHFNLFVARIATASFMAYA 120

orf66 121  IGQILDIFVFNKLRRLKAWWIAPNASTVIGHALDT 155
           +GQILD+ VFN+LR+ + WW+AP AST+ G+  DT
o221  121  LGQILDVHVFNRLRQSRRWWLAPTASTLFGNVSDT 155





Homology with a predicted ORF from N. meningitidis (strain A)



ORF66 (SEQ ID NO: 260) shows 96.1% identity over a 155 aa overlap with an ORF (ORF66a)
(SEQ ID NO: 264) from strain A of N. meningitidis:

10 20 30 40 50 60

orf 66 . pep MYAFTAAQQQKALFRLVLFH I L 1 1 AASNYLVQFPFQI FG I HTTWGAFS FPF I FLATDLTV 

llllllllllllll 1 1 1 1 1 1 1 1 II 1 1 1 1 1 1 1 1 1 1 IIMIMIII III llllll 

orf 66a MYAFTAAQQQKALFWLVLFHILI I AASNYLVQFPFQI SGIHTTWGAFSFPF I FLATDLTV 

10 20 30 40 50 60 



10 



70 80 90 100 110 120 

orf 66 . pep RI FGSHLARR I I FWVMFPALLLSYVFSV LFHNGSWTGLGALSEFNTFVGR I ALASFAAYA 

1 1 m 1 1 1 1 1 m i i Minimi 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 ii 1 1 ii 1 1 1 1 1 1 1 

orf 66a RI FGSHLARR I I FWVMFPALLLS YVFS VLFHNGSWTGLGALSEFNTFVGR I ALASFAAYA 

70 80 90 100 110 120 



15 



130 140 150 

orf 66 .pep I GQ I LP I FV FNKLRRLKAWW I APNAS TV I GHALDT 

:|||IIIIIIIIM I I :|hlllllMMI 
orf 66a LGQ I LP I FV FNKLRRLKAWWVAPTAS TV I GNALDTLVFFAVAF YAS S DGFMAANWQG I AF 

130 140 150 160 170 180 



orf 66a VDYLFKLT VCGLFFLPAYGVILNLL TKKLTTLQTKQAQDRPAPSLQNPX 

190 200 210 220 

The complete length ORF66a nucleotide sequence [<SEQ ID 263>] (SEQ ID NO: 263) is:



1 ATGTACGCAT TTACCGCCGC ACAGCAACAG AAGGCACTCT TCTGGCTGGT
51 GCTTTTTCAT ATCCTCATCA TCGCCGCCAG CAACTATCTG GTGCAGTTCC
101 CCTTCCAAAT TTCCGGCATC CACACCACTT GGGGCGCGTT TTCCTTTCCC
151 TTCATCTTCC TCGCCACCGA CCTGACCGTC CGCATTTTCG GTTCGCACTT
201 GGCACGGCGG ATTATCTTTT GGGTCATGTT CCCCGCCCTT TTGCTTTCCT
251 ACGTCTTTTC CGTTTTGTTC CACAACGGCA GTTGGACGGG CTTGGGCGCG
301 CTGTCCGAAT TCAACACCTT TGTCGGACGC ATCGCGCTGG CAAGTTTTGC
351 CGCCTACGCG CTCGGACAAA TCCTTGATAT TTTTGTGTTC AACAAATTAC
401 GCCGTCTGAA AGCGTGGTGG GTTGCCCCGA CTGCATCAAC CGTCATCGGC
451 AACGCCTTAG ATACGTTGGT ATTTTTCGCC GTTGCCTTCT ACGCAAGCAG
501 CGATGGATTT ATGGCGGCAA ACTGGCAGGG CATCGCTTTT GTCGATTACC
551 TGTTCAAACT CACCGTCTGC GGTCTGTTTT TCCTGCCCGC CTACGGCGTG
601 ATTCTGAATC TGCTGACGAA AAAACTGACG ACCCTGCAAA CCAAACAGGC
651 GCAAGACCGC CCCGCGCCCT CGCTGCAAAA TCCGTAA



This encodes a protein having amino acid sequence [<SEQ ID 264>] (SEQ ID NO: 264):



1 MYAFTAAQQQ KALFWLVLFH ILIIAASNYL VQFPFQISGI HTTWGAFSFP
51 FIFLATDLTV RIFGSHLARR IIFWVMFPAL LLSYVFSVLF HNGSWTGLGA
101 LSEFNTFVGR IALASFAAYA LGQILDIFVF NKLRRLKAWW VAPTASTVIG
151 NALDTLVFFA VAFYASSDGF MAANWQGIAF VDYLFKLTVC GLFFLPAYGV
201 ILNLLTKKLT TLQTKQAQDR PAPSLQNP*

ORF66a (SEQ ID NO: 264) and ORF66-1 (SEQ ID NO: 262) show 97.8% identity in 228 aa
overlap:



10 20 30 40 50 60 

orf66a.pep MYAFTAAQQQKALFWLVLFHILIIAASNYLVQFPFQISGIHTTWGAFSFPFIFLATDLTV

llllllllllllll 1 1 1 1 1 1 Ml I' I M 1 1 1 1 1 1 1 1 I M 1 1 1 1 M I M 1 1 M 1 1 1 1 1 

orf 66-1 MYAFTAAQQQKALFRLVLFHILI I AASNYLVQFPFQIFGI HTTWGAFSFPFI FLATDLTV 

10 20 30 40 50 60 



70 80 90 100 110 120 
orf66a.pep RIFGSHLARRIIFWVMFPALLLSYVFSVLFHNGSWTGLGALSEFNTFVGRIALASFAAYA

I M 1 1 1 1 1 1 1 1 M I M 1 1 1 1 1 M M 1 1 M II I i I II 1 1 1 1 II 1 1 1 1 1 1 1 II II I II 1 1 

orf 66-1 RIFGSHLARRIIFWVMFPALLLSYVFSVLFHNGSWTGLGALSEFNTFVGRIALASFAAYA 

70 80 90 100 110 120 

130 140 150 160 170 180 

orf 66a . pep LGQILDIFVFNKLRRLKAWWVAPTASTVIGNALDTLVFFAVAFYASSDGFMAANWQGIAF 
:| I I I I I I I I I I , I I I I I I h I I II I I I I I I I I M I I I I I I I I M I I I I I II I I I I I I I 
orf 66-1 IGQILDIFVFNKLRRLKAWWIAPTASTVIGNALDTLVFFAVAFYASSDGFMAANWQGIAF 

130 140 150 160 170 180 



10 



190 200 210 220 229 

orf 66a . pep VD YL FKLTVCGL F FL P AYGV I LNLLTKKLTTLQT KQAQDRP APSLQNPX 

Illlllllll I I I I I I I I I I I I I I I I I I I I I I II I I I I I I I I I I I 
orf 66-1 VDYLFKLTVCTLFFLPAYGVILNLLTKKLTTLQTKQAQDRPAPSLQNPX 

190 200 210 220 



Homology with a predicted ORF from N. gonorrhoeae

ORF66 (SEQ ID NO: 260) shows 94.2% identity over a 155 aa overlap with a predicted ORF
(ORF66.ng) (SEQ ID NO: 266) from N. gonorrhoeae:

orf 66. pep MYAFTAAQQQKALFRLVLFHILIIAASNYLVQFPFQIFGIHTTWGAFSFPFIFLATDLTV 60 
I I :| I I I I I I I I I I I I I I I I I I I I I I I I I I I h' I I I I I I I I I I I I ' I I I I I I I I I I 
orf66ng MYALTAAQQQKALFRLVLFHILIIAASNYLVQFPFRIFGIHTTWGAFSFPFIFLATDLTV 60

orf 66 .pep RIFGSHLARRIIFWVMFPALLLSYVFSVLFHNGSWTGLGALSEFNTFVGRIALASFAAYA 120 

I I I I I I I I I , I I I I I I I I IMIIIIIIIIMI III I: I I I I I I I I I I I I I I I I 
orf66ng RIFGSHLARRIIFWVMFPALSLSYVFSVLFHNGSWTGLGAPSQFNTFVGRIALASFAAYA 120 

orf 66 . pep I GQ I LD I FVFNKLRRLKAWW I APNASTV I GHALDT 155 

25 : I I I I I I I I I : I I I I I I I I I I I I I I I I I h I I II 

orf 66ng LGQILDIFVFDKLRRLKAWWIAPAASTVIGNALDTLVFFAVAFYASSDEFMAANWQGIAF 180 

The complete length ORF66ng nucleotide sequence [<SEQ ID 265>] (SEQ ID NO: 265) is:



1 ATGTACGCAT TGACCGCCGC ACAGCAACAG AAGGCACTCT TCCGGCTGGT
51 GCTTTTCCAT ATCCTCATCA TCGCCGCCAG CAACTATCTG GTGCAGTTCC
101 CCTTCCGGAT TTTCGGCATC CACACCACTT GGGGCGCGTT TTCCTTTCCC
151 TTCATCTTCC TCGCCACCGA CCTGACCGTC CGCATTTTCG GTTCGCACTT
201 GGCGCGGCGG ATTATCTTTT GGGTGATGTT CCCCGCCCTT ttgCTTTcat
251 aCGTCTTTTC CGTTTTGTTC CACAACGGCA GTTGGACGGG CTTGGGCGCG
301 CtgTCCCAAT TCAACACCTT TGTCGGACGC ATCGCGCTGG CAAGTTTTGC
351 CGCCTACGCG CTCGGACAAA TCCTTGATAT TTTCGTATTC GACAAATTAC
401 GCCGTCTGAA AGCGTGGTGG ATTGCCCCGG CCGCATCAAC CGTCATCGGC
451 AATGCACTGG ACACGTTAGT ATTTTTTGCC GTTGCCTTTT ACGCAAGCAG
501 CGATGAATTT ATGGCGGCAA ACTGGCAGGG CATCGCTTTT GTCGATTACC
551 TGTTCAAACT TACCGTCTGC ACCCTCTTCT TCCTGCCCGC CTACGGCGTG
601 ATACTGAATC TGCTGACGAA AAAACTGACG GCCCTGCAAA CCAAACAGGC
651 GCAAGACCGC CCCGTGCCCT CGCTGCAAAA TCCGTAA



This encodes a protein having amino acid sequence [<SEQ ID 266>] (SEQ ID NO: 266):



1 MYALTAAQQQ KALFRLVLFH ILIIAASNYL VQFPFRIFGI HTTWGAFSFP
51 FIFLATDLTV RIFGSHLARR IIFWVMFPAL SLSYVFSVLF HNGSWTGLGA
101 PSQFNTFVGR IALASFAAYA LGQILDIFVF DKLRRLKAWW IAPAASTVIG
151 NALDTLVFFA VAFYASSDEF MAANWQGIAF VDYLFKLTVC TLFFLPAYGV
201 ILNLLTKKLT ALQTKQAQDR PVPSLQNP*

An alternative annotated sequence is: 



1 MYALTAAQQQ KALFRLVLFH ILIIAASNYL VQFPFRIFGI HTTWGAFSFP
51 FIFLATDLTV RIFGSHLARR IIFWVMFPAL LLSYVFSVLF HNGSWTGLGA
101 LSQFNTFVGR IALASFAAYA LGQILDIFVF DKLRRLKAWW IAPAASTVIG
151 NALDTLVFFA VAFYASSDEF MAANWQGIAF VDYLFKLTVC TLFFLPAYGV
201 ILNLLTKKLT ALQTKQAQDR PVPSLQNP*



ORF66ng (SEQ ID NO: 266) and ORF66-1 (SEQ ID NO: 262) show 96.1% identity in 228 aa
overlap:



15 



20 



25 



orf 66-1 .pep 



orf 66ng 



orf 66- 1 .pep 
orf 66ng 
orf 66-1 .pep 
orf 66ng 
orf 66-1 .pep 
orf 66ng 



MYAFTAAQQQKALFRLVLFHILIIAASNYLVQFPFQIFGIHTTWGAFSFPFIFLATDLTV 

IIMMIIIIIIIII llllllllll MINIMUM! llllllllll MINI 
MYALTAAQQQKALFRLVLFHILIIAASNYLVQFPFRIFGIHTTWGAFSFPFIFLATDLTV 



60 



60 



RI FGSHLARRI I FWVM F PAL LLSYVFSVLFHNGSWTGLGALSEFNTFVGR IALASFAAYA 12 0 

I II I II 1 1 1 M II I M II 1 1 1 1 M 1 1 M II 1 1 1 1 1 1 II II I M 1 1 1 1 1 1 1 1 1 M 1 1 II 

R I FGSHLARR 1 1 FWVMFPALLLSYVFSVLFHNGSWTGLGALSQFNTFVGR IALASFAAYA 12 0 
IGQILDIFVFNKLRRLKAWWIAPTASTVIGNALDTLVFFAVAFYASSDGFMAANWQGIAF 180 

Ml IMMNMIM IMMNMIMMMMMMIMMMM IIIIIIIIMI 

LGQ I LD I FVFDKLRRLKAWW I APAAS TV I GNALDTLVFFAVAF YAS S DE FMAANWQG I AF 180 



VDYLFKLTVCTLFFLPAYGVILNLLTKKLTTLQTKQAQDRPAPSLQNPX 

1 1 1 M 1 1 II 1 1 M 1 1 1 1 1 1 1 1 I II I M I Ml 11 M 1 1 1 MM 1 1 1 1 M 

VDYLFKLTVCTLFFLPAYGVILNLLTKKLTALQTKQAQDRPVPSLQNPX 



229 



229 



Furthermore, ORF66ng (SEQ ID NO: 266) shows significant homology with an E. coli ORF (SEQ
ID NO: 1130):



sp|P37619|YHHQ_ECOLI HYPOTHETICAL 25.3 KD PROTEIN IN FTSY-NIKA INTERGENIC REGION
(O221)
>gi|1073495|pir||S47690 hypothetical protein o221 - Escherichia coli >gi|466607
(U00039) No definition line found [Escherichia coli] >gi|1789882 (AE000423)
hypothetical 25.3 kD protein in ftsY-nikA intergenic region [Escherichia coli]
Length = 221
Score = 273 bits (692), Expect = 5e-73
Identities = 132/203 (65%), Positives = 155/203 (76%)

Query: 1   MYALTAAQQQKALFRLVLFHILIIAASNYLVQFPFRIFGIHTTWGAFSFPFIFLATDLTV 60
           M   +  Q+ KALF L LFH+L+I +SNYLVQ P  I G HTTWGAFSFPFIFLATDLTV
Sbjct: 1   MNVFSQTQRYKALFWLSLFHLLVITSSNYLVQLPVSILGFHTTWGAFSFPFIFLATDLTV 60

Query: 61  RIFGSHLARRIIFWVMFPALLLSYVFSVLFHNGSWTGLGALSQFNTFVGRIALASFAAYA 120
           RIFG+ LARRIIF VM PALL+SYV S LF+ GSW G GAL+ FN FV RIA ASF AYA
Sbjct: 61  RIFGAPLARRIIFAVMIPALLISYVISSLFYMGSWQGFGALAHFNLFVARIATASFMAYA 120

Query: 121 LGQILDIFVFDKLRRLKAWWIAPAASTVIGNALDTLVFFAVAFYASSDEFMAANWQGIAF 180
           LGQILD+ VF++LR+ + WW+AP AST+ GN  DTL FF +AF+ S D FMA +W  IA
Sbjct: 121 LGQILDVHVFNRLRQSRRWWLAPTASTLFGNVSDTLAFFFIAFWRSPDAFMAEHWMEIAL 180

Query: 181 VDYLFKLTVCTLFFLPAYGVILN 203
           VDY FK+ +  +FFLP YGV+LN
Sbjct: 181 VDYCFKVLISIVFFLPMYGVLLN 203
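The identity and positive counts reported in a hit summary of this kind can be recomputed from the alignment midline, where a letter marks an identical residue and "+" marks a conservative substitution. The following Python sketch is illustrative only and is not part of the original analysis. The function name blast_midline_counts is hypothetical, and the midline of the final segment (residues 181 to 203) is written with its column spacing restored, which is an assumption about the original printout.

```python
def blast_midline_counts(midline: str, length: int) -> tuple[int, int]:
    """Return (identities, positives) for one BLAST alignment segment.

    Letters in the midline are identical residues; '+' marks a
    conservative substitution; spaces are non-conservative mismatches.
    """
    midline = midline.ljust(length)  # pad in case trailing blanks were trimmed
    identities = sum(ch.isalpha() for ch in midline)
    positives = identities + midline.count("+")
    return identities, positives


# Final segment of the ORF66ng / o221 alignment (spacing restored by assumption).
query = "VDYLFKLTVCTLFFLPAYGVILN"
midline = "VDY FK+ +  +FFLP YGV+LN"
sbjct = "VDYCFKVLISIVFFLPMYGVLLN"

identities, positives = blast_midline_counts(midline, len(query))

# Rounded percentages for the whole-hit counts quoted in the summary line.
pct_identities = round(100 * 132 / 203)
pct_positives = round(100 * 155 / 203)
```

For this segment the sketch counts 14 identities and 18 positives out of 23 aligned positions; the whole-hit counts of 132 and 155 out of 203 round to 65% and 76%.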

Based on this analysis, including the homology with the E. coli protein and the presence of several
putative transmembrane domains in the gonococcal protein, it is predicted that these proteins from
N. meningitidis and N. gonorrhoeae, and their epitopes, could be useful antigens for vaccines or
diagnostics, or for raising antibodies.

Example 32 

The following partial DNA sequence was identified in N. meningitidis [<SEQ ID 267>] (SEQ ID
NO: 267):

1 ATGGTCATAA AATATACAAA TTTGAATTTT GCGAAATTGT CGATAATTGC
51 AATTTTGATG ATGTATTCGT TTGAAGCGAA TGCAAAyGCA GTmwrAATAT
101 CTGAAACTGT TTCAGTTGAT ACCGGACAAG GTGCGAAAAT TCATAAGTTT
151 GTACCTAAAA ATAGTAAAAC TTATTCATCT GATTTAATAA AAACGGTAGA
201 TTTAACACAC AyyCCTACGG GCGCAAAAGC CCGAATCAAC GCCAAAATAA
251 CCGCCAGCGT ATCCCGCGCC GGCGTATTGG CGGGGGTCGG CAAACTTGCC
301 CGCTTAGgCG CGAAATTCAG CACAAGGGCG GTtCCCTATG TCGGAACAGC
351 CcTTTTAGCC CACGACGTAT ACGAAAcTTT CAAAGAAGAC ATACAGGCAC
401 GAGGCTACCA ATACGACCCC GAAACCGACA AATTTGTAAA AGGCTACGAA
451 TATAGTAATT GCCTTTGGTA CGAAGACAAA AGACGTATTA ATAGAACCTA
501 TGGCTGCTAC GGCGTTGAT . .

This corresponds to the amino acid sequence [<SEQ ID 268; ORF72>] (SEQ ID NO: 268;
ORF72):

1 MVIKYTNLNF AKLSIIAILM MYSFEANANA VXISETVSVD TGQGAKIHKF
51 VPKNSKTYSS DLIKTVDLTH XPTGAKARIN AKITASVSRA GVLAGVGKLA
101 RLGAKFSTRA VPYVGTALLA HDVYETFKED IQARGYQYDP ETDKFVKGYE
151 YSNCLWYEDK RRINRTYGCY GVD . .

Further work revealed the complete nucleotide sequence [<SEQ ID 269>] (SEQ ID NO: 269):

1 ATGGTCATAA AATATACAAA TTTGAATTTT GCGAAATTGT CGATAATTGC
51 AATTTTGATG ATGTATTCGT TTGAAGCGAA TGCAAATGCA GTAAAAATAT
101 CTGAAACTGT TTCAGTTGAT ACCGGACAAG GTGCGAAAAT TCATAAGTTT
151 GTACCTAAAA ATAGTAAAAC TTATTCATCT GATTTAATAA AAACGGTAGA
201 TTTAACACAC ATCCCTACGG GCGCAAAAGC CCGAATCAAC GCCAAAATAA
251 CCGCCAGCGT ATCCCGCGCC GGCGTATTGG CGGGGGTCGG CAAACTTGCC
301 CGCTTAGGCG CGAAATTCAG CACAAGGGCG GTTCCCTATG TCGGAACAGC
351 CCTTTTAGCC CACGACGTAT ACGAAACTTT CAAAGAAGAC ATACAGGCAC
401 GAGGCTACCA ATACGACCCC GAAACCGACA AATTTGCAAA GGTCTCAGGC
451 TAA
This corresponds to the amino acid sequence [<SEQ ID 270; ORF72-1>] (SEQ ID NO: 270;
ORF72-1):



1 MVIKYTNLNF AKLSIIAILM MYSFEANANA VKISETVSVD TGQGAKIHKF
51 VPKNSKTYSS DLIKTVDLTH IPTGAKARIN AKITASVSRA GVLAGVGKLA
101 RLGAKFSTRA VPYVGTALLA HDVYETFKED IQARGYQYDP ETDKFAKVSG
151 *

Computer analysis of this amino acid sequence gave the following results: 



Homology with a predicted ORF from N. meningitidis (strain A) 



ORF72 (SEQ ID NO: 268) shows 98.0% identity over a 147 aa overlap with an ORF (ORF72a)
(SEQ ID NO: 272) from strain A of N. meningitidis:



10 20 30 40 50 60 

MVI KYTNLNFAKLS 1 1 AI LMMYS FEANAN AVX I S ETVS VDTGQGAKI HKFVPKNS KTYSS 

MllllilllMIIIMIIMIIIIil I I 1 1 M I M I M 1 1 1 II 1 1 1 1 1 II 1 1 1 1 M 

MVI KYTNLNFAKLS I I AI LMMYS FEANAN AVKI S ETVS VDTGQGAKIHKFVPKNS KTYSS 
10 20 30 40 50 60 



orf 72 .pep 
orf 72a 



70 80 90 100 110 120 

orf 72 . pep DLIKTVDLTHXPTGAKARINAKITASVSRAGVLAGVGKLARLGAKFSTRAVPYVGTALLA 

Illlllllll 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 U 1: 1 1 M I M 1 1 M 1 1 1 1 1 1 1 II Ml 1 1 

orf 72a DLIKTVDLTHIPTGAKARINAKITASVSRAGVLAGVGKLARLGAKFSTRAVPYVGTALLA 

70 80 90 100 110 120 



130 140 150 160 170 

orf 72 . pep HDVYETFKEDIQARGYQYDPETDKFVKGYEYSNCLWYEDKRRINRTYGCYGVD 

I I I I I I I I I I I I I I I I I I I I I I I M 
orf 72a HDVYETFKED I QARGYQYDPETDKFAKVSGX 

130 140 150 

The complete length ORF72a nucleotide sequence [<SEQ ID 271>] (SEQ ID NO: 271) is:



1 ATGGTCATAA AATATACAAA TTTGAATTTT GCGAAATTGT CGATAATTGC 

51 AATTTTGATG ATGTATTCGT TTGAAGCGAA TGCAAATGCA GTAAAAATAT 

101 CTGAAACTGT TTCAGTTGAT ACCGGACAAG GTGCGAAAAT TCATAAGTTT 

151 GTACCTAAAA ATAGTAAAAC TTATTCATCT GATTTAATAA AAACGGTAGA 

201 TTTAACACAC ATCCCTACGG GCGCAAAAGC CCGAATCAAC GCCAAAATAA

251 CCGCCAGCGT ATCCCGCGCC GGCGTATTGG CGGGGGTCGG CAAACTTGCC
301 CGCTTAGGCG CGAAATTCAG CACAAGGGCG GTTCCCTATG TCGGAACAGC

351 CCTTTTAGCC CACGACGTAT ACGAAACTTT CAAAGAAGAC ATACAGGCAC

401 GAGGCTACCA ATACGACCCC GAAACCGACA AATTTGCAAA GGTCTCAGGC
451 TAA 

This encodes a protein having amino acid sequence [<SEQ ID 272>] (SEQ ID NO: 272):



1 MVIKYTNLNF AKLSIIAILM MYSFEANANA VKISETVSVD TGQGAKIHKF
51 VPKNSKTYSS DLIKTVDLTH IPTGAKARIN AKITASVSRA GVLAGVGKLA
101 RLGAKFSTRA VPYVGTALLA HDVYETFKED IQARGYQYDP ETDKFAKVSG
151 *

ORF72a (SEQ ID NO: 272) and ORF72-1 (SEQ ID NO: 270) show 100.0% identity in 150 aa
overlap:



10 20 30 40 50 60

orf 72a . pep MVIKYTNLNFAKLSIIAILMMYSFEANANAVKISETVSVDTGQGAKIHKFVPKNSKTYSS 
I I I I I I I I I I II I I I II M II II M I II I I I I I I I I I II I I I I I I I i M I I I 1 I I I I I 
orf72-l MVIKYTNLNFAKLSIIAILMMYSFEANANAVKISETVSVDTGQGAKIHKFVPKNSKTYSS 

10 20 30 40 50 60 



70 80 90 100 110 120

orf 72a . pep DL I KTVDLTH I PTGAKAR I NAKI TAS VSRAGVLAGVGKLARLGAKFSTRAVP YVGTALLA 

II IIIIIIIIIIIIMIMIIIII llllllllllllllllllllll lllllllllll 
orf 72 - 1 DLIKTVDLTHIPTGAKARINAKITASVSRAGVLAGVGKLARLGAKFSTRAVPYVGTALLA 

70 80 90 100 110 120 

130 140 150

orf 72a . pep HDVYETFKEDIQARGYQYDPETDKFAKVSGX 

I I I I I I II I I II I I I I I I I I I I I I I I I I I 

orf 72 - 1 HDVYETFKEDIQARGYQYDPETDKFAKVSGX 

130 140 150 
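A percent identity figure of the kind quoted for these overlaps can be reproduced by counting matching positions between two aligned sequences of equal length. The following Python sketch is illustrative only: the function name percent_identity is hypothetical, it assumes the two strings are already aligned without gaps, and it makes no claim about the alignment program used for the original analysis. The fragments are residues 121 to 150 of ORF72a and ORF72-1, and residues 61 to 80 of ORF72 and ORF72a, as printed above.

```python
def percent_identity(a: str, b: str) -> float:
    """Percent of positions at which two pre-aligned sequences agree."""
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    if not a:
        raise ValueError("sequences must not be empty")
    matches = sum(x == y for x, y in zip(a, b))
    return 100.0 * matches / len(a)


# Residues 121-150 of ORF72a and ORF72-1 (identical in the listing).
orf72a_tail = "HDVYETFKEDIQARGYQYDPETDKFAKVSG"
orf72_1_tail = "HDVYETFKEDIQARGYQYDPETDKFAKVSG"
tail_identity = percent_identity(orf72a_tail, orf72_1_tail)

# Residues 61-80 of ORF72 (with the unresolved residue X) and ORF72a.
orf72_fragment = "DLIKTVDLTHXPTGAKARIN"
orf72a_fragment = "DLIKTVDLTHIPTGAKARIN"
fragment_identity = percent_identity(orf72_fragment, orf72a_fragment)
```

The first pair agrees at all 30 positions; the second differs only at the unresolved residue X, giving 19 matches in 20 positions.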

Homology with a predicted ORF from N. gonorrhoeae

ORF72 (SEQ ID NO: 268) shows 89% identity over a 173 aa overlap with a predicted ORF
(ORF72.ng) (SEQ ID NO: 274) from N. gonorrhoeae:



30 



orf 72 .pep 


MV I KYTNLNFAKLS 1 1 A I LMMYS FEANANAVX I S ETVS VDTGQGAK I HKFVP KNS KT YS S 


60 


II 1 : 1 1 1 II 1 1 1 1 1 1 1 1 1 1 1 1 1 ! 1 1 1 1 1 1 IMhllllllllhlllllhh III 




orf 72ng 


MVTKHTNLNFAKLSIIAILMMYSFEANANAVKISETLSVDTGQGAKVHKFVPKSSNIYSS 


60 


orf 72 .pep 


DLIKTVDLTHXPTGAKARINAKITASVSRAGVLAGVGKLARLGAKFSTRAVPYVGTALLA 


120 


II Mill 1 1 1 1 1 1 M 1 1 1 1 1 1 1 1 1 1 1 i M 1 1 1 M 1 1 1 M 1 1 1 1 1 M 1 1 M 1 




orf 72ng 


DLTKAVDLTH I PTGAKAR I NAKI TAS VSRAGVLSGVGKLVRQGAKFGTRAVP YVGTALLA 


120 


orf 72 .pep 


HDVYETFKEDIQARGYQYDPETDKFVKGYEYSNCLWYEDKRRINRTYGCYGVD 


173 




lllllllllllllll : II II III IIM llhllll Mh 1 III III III 1 II 




orf 72ng 


HDVYETFKEDIQARGCRYDPETDKFVKGYEYANCLWYEDERRINRTYGCYGVDSSIMRLM 


180 



An ORF72ng nucleotide sequence [<SEQ ID 273>] (SEQ ID NO: 273) was predicted to encode a
protein having amino acid sequence [<SEQ ID 274>] (SEQ ID NO: 274):



1 MVTKHTNLNF AKLSIIAILM MYSFEANANA VKISETLSVD TGQGAKVHKF

51 VPKSSNIYSS DLTKAVDLTH I PTGAKAR IN AKITASVSRA GVLSGVGKLV 

101 RQGAKFGTRA VP YVGTALLA HDVYETFKED IQARGCRYDP ETDKFVKGYE 

151 YANCLWYEDE RRINRTYGCY GVDSSIMRLM PDRSRFPEVK QLMESQMYRL 

201 ARPFWNWRKE ELNKLSSLDW NNFVLNRCTF DWNGGGCAVN KGDDFRAGAS 

251 FSLGRNPKYK EEMDAKKPEE ILSLKVDADP DKYIEATGYP GYSEKVEVAP

301 GTKVNMGPVT DRNGNPVQVA ATFGRDAQGN TTADVQVIPR PDLTPASAEA 

351 PHAQPLPEVS PAENPANNPD PDENPGTRPN PEPDPDLNPD ANPDTDGQPG 

401 TSPDSPAVPD RPNGRHRKER KEGEDGGLSC DYFPEILACQ EMGKPSDRMF 
451 HDISIPQVTD DKTWSSHNFL PSNGVCPQPK TFHVFGRQYR ASYEPLCVFA

501 EKIR FAVLLA FIIMSAFWF G SLGGE* 

After further analysis, the following gonococcal DNA sequence [<SEQ ID 275>] (SEQ ID NO:
275) was identified:

1 ATGGTCACAA AACATACAAA TTTGAATTTT GCGAAATTGT CGATAATTGC 

51 AATTTTGATG ATGTATTCGT TTGAAGCGAA TGCAAATGCA GTAAAAATAT 

101 CTGAAACTCT TTCGGTTGAT ACCGGACAAG GCGCGAAAGT TCATAAGTTC 

151 GTTCCTAAAT CAAGTAATAT TTATTCATCT GATTTAACAA AAGCGGTAGA 

201 TTTAACGCAT ATCCCCACGG GCGCAAAAGC CCGAATCAAC GCCAAAATAA

251 CCGCCAGCGT ATCCCGCGCC GGCGTATTGT CGGGGGTCGG CAAACTTGTC 

301 CGCCAAGGCG CGAAATTCGG CACAAGGGCG GTTCCCTATG TCGGAACAGC 

351 CCTTTTAGCC CACGACGTAT ACGAAACTTT CAAAGAAGAC ATACAGGCAC 

401 GAGGCTGCCG ATACGATCCC GAAACCGACA AATTT



This corresponds to the amino acid sequence [<SEQ ID 276; ORF72ng-1>] (SEQ ID NO: 276;
ORF72ng-1):



1 MVTKHTNLNF AKLSIIAILM MYSFEANANA VKISETLSVD TGQGAKVHKF
51 VPKSSNIYSS DLTKAVDLTH IPTGAKARIN AKITASVSRA GVLSGVGKLV
101 RQGAKFGTRA VPYVGTALLA HDVYETFKED IQARGCRYDP ETDKF

ORF72ng-1 (SEQ ID NO: 276) and ORF72-1 (SEQ ID NO: 270) show 89.7% identity in 145 aa
overlap:



10 20 30 40 50 60 

25 orf 72ng-l .pe MVTKHTNLNFAKLS 1 1 A I LMMYS FEANANAVKI S ETLSVDTGQGAKVHKFVPKS SN I YS S 

II [MM I M i I I i ! 1 I I I I I I 1 I 1 I I I I ■ I ! M : I I I I 1 1 I I I ^ ! I I i I I : : III 
orf 72-1 MVIKYTNLNFAKLSI IAILMMYSFEANANAVKISETVSVDTGQGAKIHKFVPKNSKTYSS 

10 20 30 40 50 60 

70 80 90 100 110 120 

30 orf 72ng-l .pe DLTKAVDLTH I PTGAKAR I NAKITASVSRAGVLSGVGKLVRQGAKFGTRAVPYVGTALLA 

II I Ml M I I I M I I I I I 111 I I I I I M I Ml I I M lllhlilllllilllll 
orf 72 - 1 DLI KTVDLTHI PTGAKARINAKITASVSRAGVLAGVGKLARLGAKFSTRAVPYVGTALLA 

70 80 90 100 110 120 

130 140 
35 orf72ng-l.pe HDVYETFKEDIQARGCRYDPETDKF 

Illllllllllllll =11111111 
orf 72 - 1 HDVYETFKED I QARGYQYDPETDKFAKVSGX 

130 140 150 



Based on this analysis, including the presence of a putative leader sequence and transmembrane
domains in the gonococcal protein, it is predicted that the proteins from N. meningitidis and
N. gonorrhoeae, and their epitopes, could be useful antigens for vaccines or diagnostics, or for
raising antibodies.
Example 33



The following partial DNA sequence was identified in N. meningitidis [<SEQ ID 277>] (SEQ ID
NO: 277):



1 ATGAGATTTT TCGGTATCGG TTTTTTGGTG CTGCTGTTTT TGGAGATTAT 

51 GTCGATTGTG TGGGTTGCCG ATTGGCTGGG CGGCGGCTGG ACGTTGTTTT 

101 TGATGGCGGC AGGTTTTGCC GCCGGCGTGC TGATGCTCAG GCAAACCGGG 

151 GCTGACCGGT CTTTTATTGG CGGGCGCGGC AATGAGAAGC GGCGGGAAGG 

201 TATCCGTTTA TCAGATGTTG TGGCCTATC . .

This corresponds to the amino acid sequence [<SEQ ID 278; ORF73>] (SEQ ID NO: 278;
ORF73):



1 MRFFGIGFLV LLFLEIMSIV WVADWLGGGW TLFLMAAGFA AGVLMLRQTG 
51 LTGLLLAGAA MRSGGKVSVY QMLWPI..
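
The correspondence between a nucleotide sequence and its encoded amino acid sequence can be checked by conceptual translation. The following is a minimal sketch, assuming the standard genetic code and reading frame 1; the function name `translate` and the variable names are illustrative only and do not come from the specification. It translates only the first 150 nucleotides of SEQ ID NO: 277 as printed above, which encode the first 50 residues of ORF73.

```python
# Standard genetic code, indexed by base order T, C, A, G at each codon position.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    b1 + b2 + b3: AMINO[16 * i + 4 * j + k]
    for i, b1 in enumerate(BASES)
    for j, b2 in enumerate(BASES)
    for k, b3 in enumerate(BASES)
}


def translate(dna: str) -> str:
    """Translate DNA in frame 1; codons with ambiguous bases become 'X'."""
    dna = "".join(dna.split()).upper()
    usable = len(dna) - len(dna) % 3  # ignore a trailing partial codon
    codons = (dna[i:i + 3] for i in range(0, usable, 3))
    return "".join(CODON_TABLE.get(codon, "X") for codon in codons)


# First 150 nucleotides of SEQ ID NO: 277, copied from the listing above.
SEQ_277_FIRST_150 = """
ATGAGATTTT TCGGTATCGG TTTTTTGGTG CTGCTGTTTT TGGAGATTAT
GTCGATTGTG TGGGTTGCCG ATTGGCTGGG CGGCGGCTGG ACGTTGTTTT
TGATGGCGGC AGGTTTTGCC GCCGGCGTGC TGATGCTCAG GCAAACCGGG
"""

protein = translate(SEQ_277_FIRST_150)
print(protein)
```

The printed string can be compared residue by residue with the first line of the ORF73 amino acid sequence.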

Further work revealed the complete nucleotide sequence [<SEQ ID 279>] (SEQ ID NO: 279):



1 ATGAGATTTT TCGGTATCGG TTTTTTGGTG CTGCTGTTTT TGGAGATTAT 

51 GTCGATTGTG TGGGTTGCCG ATTGGCTGGG CGGCGGCTGG ACGTTGTTTT 

101 TGATGGCGGC AGGTTTTGCC GCCGGCGTGC TGATGCTCAG GCATACGGGG 

151 CTGTCCGGTC TTTTATTGGC GGGCGCGGCA ATGAGAAGCG GCGGGAGGGT 

201 ATCCGTTTAT CAGATGTTGT GGCCTATCCG TTATACGGTG GCGGCTGTGT 

251 GTCTGATGAG TCCGGGATTC GTATCCTCGG TGTTGGCGGT ATTGCTGCTG 

301 CTGCCGTTTA AGGGAGGGGC AGTGTTGCAG GCAGGAGGTG CGGAAAATTT
351 TTTCAACATG AACCAATCGG GCAGAAAAGA GGGCTTTTCC CGCGATGACG

401 ATATTATCGA GGGAGAATAT ACGGTTGAAG AGCCTTACGG CGGCAATCGT
451 TCCCGAAACG CCATCGAACA CAAAAAAGAC GAATAA

This corresponds to the amino acid sequence [<SEQ ID 280; ORF73-1>] (SEQ ID NO: 280;
ORF73-1):



1 MRFFGIGFLV LLFLEIMSIV WVADWLGGGW TLFLMAAGFA AGVLMLRHTG 

51 LSGLLLAGAA MRSGGRVSVY QMLWPIRYTV AAVCLMSPGF VSSVLAVLLL

101 LPFKGGAVLQ AGGAENFFNM NQSGRKEGFS RDDDIIEGEY TVEEPYGGNR 

151 SRNAIEHKKD E* 

Computer analysis of this amino acid sequence gave the following results: 



Homology with a predicted ORF from N. meningitidis (strain A)

ORF73 (SEQ ID NO: 278) shows 90.8% identity over a 76aa overlap with an ORF (ORF73a)
(SEQ ID NO: 282) from strain A of N. meningitidis:



10 20 30 40 50 60 

orf73.pep  MRFFGIGFLVLLFLEIMSIVWVADWLGGGWTLFLMAAGFAAGVLMLRQTGLTGLLLAGAA
orf73a     MRFFGIGFLVLLFLEIMSIVWVADWLGGGWTLFLMAATFAAGVVMLRHTGLSGLLLAGAA

10 20 30 40 50 60 

70 

orf 73 . pep MRSGGKVSVYQMLWPI 
IIIIMIII III I 

orf73a MRSGGRVSVYXMLWXIRYTVAAVCXMSPGFVSSVXAVLLXLPFKGGAVLQAGGAENFFNM

The complete length ORF73a nucleotide sequence [<SEQ ID 281>] (SEQ ID NO: 281) is:



1 ATGAGATTTT TCGGTATCGG TTTTTTGGTG CTGCTGTTTT TGGAGATTAT
51 GTCGATTGTG TGGGTTGCCG ATTGGTTGGG CGGCGGTTGG ACGCTGTTTC
101 TAATGGCGGC AACCTTTGCC GCCGGCGTGG TGATGCTCAG GCATACGGGG
151 CTGTCCGGTC TTTTATTGGC GGGCGCGGCA ATGAGAAGCG GCGGGAGGGT
201 ATCCGTTTAT CANATGTTGT GGCNTATCCG TTATACGGTG GCGGCGGTGT
251 GTCNGATGAG TCCGGGATTC GTATCCTCGG TGTNGGCGGT ATTGCTGNTG
301 CTNCCGTTTA AGGGAGGTGC AGTGTTGCAG GCAGGAGGTG CGGAAAATTT
351 TTTCAACATG AACCANTCGG GCAGAAAAGA NGGCNTTTCC CGCGATGACG
401 ATATTATCGA GGGGGAATAT ACGGTTGAAG ANCCTTACGG CGGCANTCGT
451 TTCCGAAACG CCNTNGAACA CAAAAAAGAC GAATAA


This encodes a protein having amino acid sequence [<SEQ ID 282>] (SEQ ID NO: 282):



1 MRFFGIGFLV LLFLEIMSIV WVADWLGGGW TLFLMAATFA AGVVMLRHTG

51 LSGLLLAGAA MRSGGRVSVY XMLWXIRYTV AAVCXMSPGF VSSVXAVLLX

101 LPFKGGAVLQ AGGAENFFNM NXSGRKXGXS RDDDIIEGEY TVEXPYGGXR

151 FRNAXEHKKD E*

ORF73a (SEQ ID NO: 282) and ORF73-1 (SEQ ID NO: 280) show 91.3% identity in 161 aa
overlap:






10 20 30 40 50 60 

orf 73a. pep MRFFGIGFLVLLFLEIMSIVWADWLGGGWTLFLMAATFAAGVVMLRHTGLSGLLLAGAA 

MM MMM MMMMMMMMMMIMM M M Ml I M M M M M I M I 

orf 73 - 1 MRFFGIGFLVLLFLEIMSIVWVADWLGGGWTLFLMAAGFAAGVLMLRHTGLSGLLLAGAA 

10 20 30 40 50 60 






70 80 90 100 110 120 

orf 73a . pep MRSGGRVS VYXMLWX I RYTVAAVCXMS PGFVS S VXAVLLXLPFKGGAVLQAGGAENFFNM 

MMMM III IIIIMIII IIIIMIII MM 1 1 II I II II 1 1 II 1 1 1 1 1 II 

orf 73 - 1 MRSGGRVS VYQMLWP I RYTVAAVCLMS PGFVS SVLAVLLLLPFKGGAVLQAGGAENFFNM 

70 80 90 100 110 120 






130 140 150 160 

orf 73a . pep NXSGRKXGXSRDDDI IEGEYTVEXPYGGXRFRNAXEHKKDEX 

I MM I 1 1 1 1 1 MM I III MMMI 

orf 73-1 NQSGRKEGFSRDDDIIEGEYTVEEPYGGNRSRNAIEHKKDEX 

130 140 150 160 
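
The identity figure quoted for ORF73a and ORF73-1 can be reproduced with a simple position-by-position count. The sketch below is an illustration only: the specification does not name the alignment program, and the function `percent_identity` is a hypothetical helper. It assumes an ungapped alignment of the two 161-residue sequences listed above, counts the ambiguous residue X as a mismatch, and reads the run-together "W" in the scanned ORF73a listing as "VV", which is what the ORF73a nucleotide sequence encodes.

```python
def percent_identity(seq_a: str, seq_b: str) -> tuple[int, float]:
    """Return (matching positions, percent identity) for an ungapped alignment."""
    if len(seq_a) != len(seq_b):
        raise ValueError("ungapped comparison needs sequences of equal length")
    matches = sum(a == b for a, b in zip(seq_a, seq_b))
    return matches, 100.0 * matches / len(seq_a)


# ORF73-1 (SEQ ID NO: 280), stop codon omitted.
ORF73_1 = (
    "MRFFGIGFLVLLFLEIMSIVWVADWLGGGWTLFLMAAGFAAGVLMLRHTG"
    "LSGLLLAGAAMRSGGRVSVYQMLWPIRYTVAAVCLMSPGFVSSVLAVLLL"
    "LPFKGGAVLQAGGAENFFNMNQSGRKEGFSRDDDIIEGEYTVEEPYGGNR"
    "SRNAIEHKKDE"
)

# ORF73a (SEQ ID NO: 282), stop codon omitted; X marks an undetermined residue.
ORF73A = (
    "MRFFGIGFLVLLFLEIMSIVWVADWLGGGWTLFLMAATFAAGVVMLRHTG"
    "LSGLLLAGAAMRSGGRVSVYXMLWXIRYTVAAVCXMSPGFVSSVXAVLLX"
    "LPFKGGAVLQAGGAENFFNMNXSGRKXGXSRDDDIIEGEYTVEXPYGGXR"
    "FRNAXEHKKDE"
)

matches, identity = percent_identity(ORF73A, ORF73_1)
print(f"{matches}/{len(ORF73_1)} identical positions = {identity:.1f}%")
```

Under these assumptions the count agrees with the 91.3% identity over 161 aa stated above.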



Homology with a predicted ORF from N. gonorrhoeae

ORF73 (SEQ ID NO: 278) shows 92.1% identity over a 76aa overlap with a predicted ORF
(ORF73.ng) (SEQ ID NO: 284) from N. gonorrhoeae:

orf 73 .pep MRFFGIGFLVLLFLEIMSIVWVADWLGGGWTLFLMAAGFAAGVLMLRQTGLTGLLLAGAA 60 

I I I I I I I I I I I I I I I I I I I I Ml I M I I I I I I I I I M 111111111 = 111 = 11111111 
5 orf 73ng MRFFGIGFLVLLFLEIMSIVWADWLGGGWTLFLMAATFAAGVLMLRHTGLSGLLLAGAA 60 

orf 73. pep MRSGGKVS VYQMLWP I 76 

-MINIMUM 

or f 7 3 ng VKSSGKVS VYQMLWP I RYTVAAVCLMS PGFVS S VLAVLLLLPFKGGAVLQAGGAENFFNM 120 

The complete length ORF73ng nucleotide sequence [<SEQ ID 283>] (SEQ ID NO: 283) is:



1 ATGAGATTTT TCGGTATCGG TTTTTTGGTG CTGCTGTTTT TGGAAATTAT 

51 GTCGATTGTG TGGGTTGCCG ATTGGCTGGG CGGCGGTTGG AcgcTGTTTC 

101 TAATGGCGGC AACCTTTGCC GCCGGTGTGC TGATGCTCAG GCATAcggGG 

151 CTGTCCGGTC TTTTATTGGC TGGCGCGGCG GTAAAAagta gtgGGAAGGT 

201 ATCTGTTTAT CagatgtTGT GGCCTATCCG TTATAcggtg gcggcggtgT

251 GTCTGatgag tCcggGATTC GTATCCTccg tgttggCGGT ATTGCTGCTG 

301 CTGCcgttta aggGaggGgc agtgttgcag gcaggaggtg cggaaaATTT 

351 TTTCAACATg aaCcaatcgg gcagaaAaga gggatttttc cacgatgacg 

401 atattatcga gggagaatat acggttgaaa aacctgacgg cggcaatcgt 

451 tcccgaAAcg ccatcgaaca cgaaaAagac gaataA

This encodes a protein having amino acid sequence [<SEQ ID 284>] (SEQ ID NO: 284):



1 MRFFGIGFLV LLFLEIMSIV WVADWLGGGW TLFLMAATFA AGVLMLRHTG
51 LSGLLLAGAA VKSSGKVSVY QMLWPIRYTV AAVCLMSPGF VSSVLAVLLL
101 LPFKGGAVLQ AGGAENFFNM NQSGRKEGFF HDDDIIEGEY TVEKPDGGNR

151 SRNAIEHEKD E*

ORF73ng (SEQ ID NO: 284) and ORF73-1 (SEQ ID NO: 280) show 93.8% identity in 161 aa
overlap:



30 10 20 30 40 50 60 

orf 73 - 1 . pep MRFFGIGFLVLLFLEIMSIVWVADWLGGGWTLFLMAAGFAAGVLMLRHTGLSGLLLAGAA 

1 1 1 1 1 i 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 i 1 1 1 1 1 1 1 1 1 M I II 1 1 1 1 1 1 M 1 1 M 1 1 1 1 

orf 73ng MRFFGIGFLVLLFLEIMSIVWVADWLGGGWTLFLMAATFAAGVLMLRHTGLSGLLLAGAA 

10 20 30 40 50 60 



35 70 80 90 100 110 120 

orf 73 - 1 . pep MRSGGRVSVYQMLWP I RYTVAAVCLMS PGFVS S VLAVLLLLPFKGGAVLQAGGAENFFNM 

-IMMM M M 1 1 1 1 1 1 1 1 1 1 1 1 M 1 1 M 1 1 1 M 1 1 1 M 1 1 M I II 1 1 1 1 1 1 1 1 1 M 

or f 7 3 ng VKSSGKVS VYQMLWP I RYTVAAVCLMS PGFVSS VLAVLLLLP FKGGAVLQ AGGAENFFNM 

70 80 90 100 110 120 



40 130 140 150 160 

orf 73-1. pep NQSGRKEGFSRDDDIIEGEYTVEEPYGGNRSRNAIEHKKDEX 

IMMM MMMI MMM I II 1 1 M 1 1 1 h 1 1 1 1 

orf 73ng NQSGRKEGFFHDDDIIEGEYTVEKPDGGNRSRNAIEHEKDEX 

130 140 150 160 

Based on this analysis, including the presence of a putative leader sequence and putative
transmembrane domain in the gonococcal protein, it is predicted that the proteins from
N. meningitidis and N. gonorrhoeae, and their epitopes, could be useful antigens for vaccines or
diagnostics, or for raising antibodies.
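
The specification does not state which program predicted the leader sequence and transmembrane domain. One common heuristic for flagging candidate membrane-spanning segments is a sliding-window hydropathy average. The sketch below is an assumption-laden illustration, not the method used here: it applies the Kyte-Doolittle scale with a 19-residue window and a cut-off of 1.6 (conventional choices, not taken from the specification) to ORF73-1 (SEQ ID NO: 280). The names `KD`, `window_hydropathy` and `THRESHOLD` are hypothetical.

```python
# Kyte-Doolittle hydropathy values for the 20 standard amino acids.
KD = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def window_hydropathy(seq: str, window: int = 19) -> list[float]:
    """Mean hydropathy of every full-length window along the sequence."""
    return [
        sum(KD[res] for res in seq[i:i + window]) / window
        for i in range(len(seq) - window + 1)
    ]


# ORF73-1 (SEQ ID NO: 280), stop codon omitted.
ORF73_1 = (
    "MRFFGIGFLVLLFLEIMSIVWVADWLGGGWTLFLMAAGFAAGVLMLRHTG"
    "LSGLLLAGAAMRSGGRVSVYQMLWPIRYTVAAVCLMSPGFVSSVLAVLLL"
    "LPFKGGAVLQAGGAENFFNMNQSGRKEGFSRDDDIIEGEYTVEEPYGGNR"
    "SRNAIEHKKDE"
)

THRESHOLD = 1.6  # assumed cut-off for a candidate membrane-spanning window
scores = window_hydropathy(ORF73_1)

# Report 1-based start positions of windows at or above the cut-off.
hits = [(i + 1, round(s, 2)) for i, s in enumerate(scores) if s >= THRESHOLD]
print(hits)
```

Windows above the cut-off are only candidates; a dedicated signal-peptide or topology predictor would be needed to confirm a leader sequence or transmembrane domain.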



Example 34 



The following partial DNA sequence was identified in N. meningitidis [<SEQ ID 285>] (SEQ ID
NO: 285):



1 ATGTTTGTTT TTCAGACGGC ATTCTT.ATG TTTCAGAAAC ATTTGCAGAA
51 AGCCTCCGAC AGCGTCGTCG GAGGGACATT ATACGTGGTT GCCACGCCCA
101 TCGGCAATTT GGCGGACATT ACCCTGCGCG CTTTGGCGGT ATTGCAAAAG
151 GCG GCCGA AGACACGCGC GTTACCGCAC AGCTTTTGAG
201 CGCGTACGGC ATTCAGGGCA AACTCGTCAG TGTGCGCGAA CACAACGAAC
251 GGCAGATGGC GGACAAGATT GTCGGCTATC TTTCAGACGG CATGGTTGTG
301 GCACAGGTTT CCGATGCGGG TACGCCGGCC GTGTGCGACC CGGGCGCGAA
351 ACTCGCCCGC CGCGTGCGTG AGGCCGGGTT TAAAGTCGTT CCCGTCGTGG
401 GCGCAAC.GC GGTGATGGCG GCTTTGAGCG TGGCCGGTGT GGAAGGATCC
451 GATTTTTATT TCAACGGTTT TGTACCGCCG AAATCGGGAG AACGCAGGAA
501 ACTGTTTGCC AAATGGGTGC GGGCGGCGTT TCCTATCGTC ATGTTTGAAA
551 CGCCGCACCG CATCGGTGCA GCGCTTGCCG ATATGGCGGA ACTGTTCCCC
601 GAACGCCGAT TAATGCTGGC GCGCGAAATT ACGAAAACGT TTGAAACGTT
651 CTTAAGCGGC ACGGTTGGGG AAATTCAGAC GGCATTGTCT GCCGACGGCG
701 ACCAATCGCG CGGCGAGATG GTGTTGGTGC TTTATCCGGC GCAGGATGAA
751 AAACACGAAG GCTTGTCCGA GTCCGCGCAA AACATCATGA AAATCCTCAC
801 AGCCGAGCTG CCGACCAAAC AGGCGGCGGA GCTTGCTGCC AAAATCACGG
851 GCGAGGGAAA GAAAGCTTTG TACGAT..

This corresponds to the amino acid sequence [<SEQ ID 286; ORF75>] (SEQ ID NO: 286;
ORF75):

1 MFVFQTAFXM FQKHLQKASD SVVGGTLYVV ATPIGNLADI TLRALAVLQK

51 A....AEDTR VTAQLLSAYG IQGKLVSVRE HNERQMADKI VGYLSDGMVV

101 AQVSDAGTPA VCDPGAKLAR RVREAGFKVV PVVGAXAVMA ALSVAGVEGS

151 DFYFNGFVPP KSGERRKLFA KWVRAAFPIV MFETPHRIGA ALADMAELFP

201 ERRLMLAREI TKTFETFLSG TVGEIQTALS ADGDQSRGEM VLVLYPAQDE

251 KHEGLSESAQ NIMKILTAEL PTKQAAELAA KITGEGKKAL YD..

Further work revealed the complete nucleotide sequence [<SEQ ID 287>] (SEQ ID NO: 287):



1 ATGTTTCAGA AACATTTGCA GAAAGCCTCC GACAGCGTCG TCGGAGGGAC 

51 ATTATACGTG GTTGCCACGC CCATCGGCAA TTTGGCGGAC ATTACCCTGC 

101 GCGCTTTGGC GGTATTGCAA AAGGCGGACA TCATCTGTGC CGAAGACACG 

151 CGCGTTACCG CACAGCTTTT GAGCGCGTAC GGCATTCAGG GCAAACTCGT 

201 CAGTGTGCGC GAACACAACG AACGGCAGAT GGCGGACAAG ATTGTCGGCT
251 ATCTTTCAGA CGGCATGGTT GTGGCACAGG TTTCCGATGC GGGTACGCCG

301 GCCGTGTGCG ACCCGGGCGC GAAACTCGCC CGCCGCGTGC GTGAGGCCGG
351 GTTTAAAGTC GTTCCCGTCG TGGGCGCAAG CGCGGTGATG GCGGCTTTGA
401 GCGTGGCCGG TGTGGAAGGA TCCGATTTTT ATTTCAACGG TTTTGTACCG
451 CCGAAATCGG GAGAACGCAG GAAACTGTTT GCCAAATGGG TGCGGGCGGC

501 GTTTCCTATC GTCATGTTTG AAACGCCGCA CCGCATCGGT GCGACGCTTG 

551 CCGATATGGC GGAACTGTTC CCCGAACGCC GATTAATGCT GGCGCGCGAA 

601 ATTACGAAAA CGTTTGAAAC GTTCTTAAGC GGCACGGTTG GGGAAATTCA 

651 GACGGCATTG TCTGCCGACG GCAACCAATC GCGCGGCGAG ATGGTGTTGG

701 TGCTTTATCC GGCGCAGGAT GAAAAACACG AAGGCTTGTC CGAGTCCGCG 

751 CAAAACATCA TGAAAATCCT CACAGCCGAG CTGCCGACCA AACAGGCGGC 

801 GGAGCTTGCT GCCAAAATCA CGGGCGAGGG AAAGAAAGCT TTGTACGATC 

851 TGGCTCTGTC TTGGAAAAAC AAATAG 

This corresponds to the amino acid sequence [<SEQ ID 288; ORF75-1>] (SEQ ID NO: 288;
ORF75-1):

1 MFQKHLQKAS DSVVGGTLYV VATPIGNLAD ITLRALAVLQ KADIICAEDT

51 RVTAQLLSAY GIQGKLVSVR EHNERQMADK IVGYLSDGMV VAQVSDAGTP

101 AVCDPGAKLA RRVREAGFKV VPVVGASAVM AALSVAGVEG SDFYFNGFVP

151 PKSGERRKLF AKWVRAAFPI VMFETPHRIG ATLADMAELF PERRLMLARE

201 ITKTFETFLS GTVGEIQTAL SADGNQSRGE MVLVLYPAQD EKHEGLSESA

251 QNIMKILTAE LPTKQAAELA AKITGEGKKA LYDLALSWKN K*

Computer analysis of this amino acid sequence gave the following results:

Homology with a predicted ORF from N. meningitidis (strain A)

ORF75 (SEQ ID NO: 286) shows 95.8% identity over a 283aa overlap with an ORF (ORF75a)
(SEQ ID NO: 290) from strain A of N. meningitidis:

10 20 30 40 50 60 

25 orf75.pep MFVFQTAFXMFQKHLQKASDSVVGGTLYVVATPIGNLADITLRALAVLQKAXXXXAEDTR 

1 1 1 1 , 11 1 1 1 M 1 1 II I 1 1 1 1 1 1 1 1 1 M I ! 1 1 1 1 1 1 1 1 1 1 Mill 

O r f 7 5 a M FQKHLQKASDS WGGTL YWAT P I GNLAD I TLRALAVLQKAD 1 1 CAEDTR 

10 20 30 40 50 

70 80 90 100 110 120 

30 or f 7 5 . pep VTAQLLSAYGIQGKLVS VREHNERQMADKI VGYLSDGMWAQVSDAGTPAVCDPGAKLAR 

I I I I I I I I I I I I I I I I I II I I I M I I I I I I I II I I I I I . I I I I I I I I I I I I I I I I I I I 
or f 7 5a VTAQLLSAYGIQGKLVS WEHNERQMADKI VGYLSDGMWAQVSDAGTPAVCDPGAKLAR 

60 70 80 90 100 110 

130 140 150 160 170 180 

35 orf 75 . pep RWEAGFK WPWGAXAVMAALSVA GVEGSDFYFNGFVPPKSGERRKLFAKWVRAAFPIV 

1 1 1 1 : 1 ! 1 1 1 1 1 1 1 MINIMI I 1 1 1 1 1 1 1 1 1 1 1 1 1 L 1 1 1 1 1 1 i 1 1 1 1 1 = 1 1 1 = I 

orf 75a RVREVGF KWPWGASAVMAALSVA GVAGSDFYFNGF^PPK^GERRKIjFAKWVRVAFPVV 
120 130 140 150 160 170 

190 200 210 220 230 240 

40 orf 75 . pep MFETPHRIGAALADMAELFPERRLMLAREITKTFETFLSGTVGEIQTALSADGDQSRGEM 

I I I M I I I M : I I I M M M I I I I I I I I I I I I I I I I I I I I I I I 1 I 1 I : I 1 : I I I M I 
orf 75a MFETPHRIGATLADMAELFPERRLMLAREITKTFETFLSGTVGEIQTALAADGNQSRGEM 
180 190 200 210 220 230 

250 260 270 280 290 

orf75.pep  VLVLYPAQDEKHEGLSESAQNIMKILTAELPTKQAAELAAKITGEGKKALYD
           ||||||||||||||||||||||||||||||||||||||||||||||||||||
orf75a     VLVLYPAQDEKHEGLSESAQNIMKILTAELPTKQAAELAAKITGEGKKALYDLALSWKNK
240 250 260 270 280 290 

orf75a X 

The complete length ORF75a nucleotide sequence [<SEQ ID 289>] (SEQ ID NO: 289) is:

1 ATGTTTCAGA AACATTTGCA GAAAGCCTCC GACAGCGTCG TCGGAGGGAC 

51 ATTATACGTG GTTGCCACGC CCATCGGCAA TTTGGCGGAC ATTACCCTGC 

101 GCGCTTTGGC GGTATTGCAA AAGGCGGACA TCATCTGTGC CGAAGACACG 

151 CGCGTTACCG CGCAGCTTTT GAGCGCGTAC GGCATTCAGG GCAAACTCGT 

201 CAGCGTGCGC GAACACAACG AACGGCAGAT GGCGGACAAG ATTGTCGGCT

251 ATCTTTCAGA CGGCATGGTT GTGGCACAGG TTTCCGATGC GGGTACGCCG

301 GCCGTGTGCG ACCCGGGCGC GAAACTCGCC CGCCGCGTGC GTGAGGTCGG
351 GTTTAAAGTT GTCCCTGTTG TCGGCGCAAG CGCGGTGATG GCGGCTTTGA

401 GTGTGGCTGG TGTGGCGGGA TCCGATTTTT ATTTCAACGG TTTTGTACCG
451 CCGAAATCGG GCGAACGTAG GAAATTGTTT GCCAAATGGG TGCGGGTGGC

501 GTTTCCCGTC GTGATGTTTG AAACGCCGCA CCGCATCGGG GCGACGCTTG 

551 CCGATATGGC GGAACTGTTC CCCGAACGCC GATTAATGCT GGCGCGCGAA 

601 ATCACGAAAA CGTTTGAAAC GTTCTTAAGC GGCACGGTTG GGGAAATTCA 

651 GACGGCATTG GCGGCGGACG GCAACCAATC GCGCGGCGAG ATGGTGTTGG 

701 TGCTTTATCC GGCGCAGGAT GAAAAACACG AAGGCTTGTC CGAGTCCGCG

751 CAAAACATCA TGAAAATCCT CACAGCCGAG CTGCCGACCA AACAGGCGGC 

801 GGAGCTTGCC GCCAAAATCA CGGGCGAGGG AAAAAAAGCT TTGTACGATC 

851 TGGCACTGTC TTGGAAAAAC AAATGA 

This encodes a protein having amino acid sequence [<SEQ ID 290>] (SEQ ID NO: 290):

1 MFQKHLQKAS DSVVGGTLYV VATPIGNLAD ITLRALAVLQ KADIICAEDT

51 RVTAQLLSAY GIQGKLVSVR EHNERQMADK IVGYLSDGMV VAQVSDAGTP

101 AVCDPGAKLA RRVREVGFKV VPVVGASAVM AALSVAGVAG SDFYFNGFVP

151 PKSGERRKLF AKWVRVAFPV VMFETPHRIG ATLADMAELF PERRLMLARE

201 ITKTFETFLS GTVGEIQTAL AADGNQSRGE MVLVLYPAQD EKHEGLSESA

251 QNIMKILTAE LPTKQAAELA AKITGEGKKA LYDLALSWKN K*

ORF75a (SEQ ID NO: 290) and ORF75-1 (SEQ ID NO: 288) show 98.3% identity in 291 aa
overlap:

35 10 20 30 40 50 60 

orf 75a . pep MFQKHLQKASDSWGGTLYWATPIGNLADITLRALAVLQKADI ICAEDTRVTAQLLSAY 

Ml II II II III MINI I MINI III IIIMIIIIIIII Mill MM MINIMI I 

orf 75-1 MFQKHLQKASDSWGGTLYWATP I GNLAD I TLRALAVLQKAD I ICAEDTRVTAQLLSAY 

10 20 30 40 50 60 

40 70 80 90 100 110 120 

orf 7 5a . pep GIQGKLVSVREHNERQMADKI VGYLSDGMWAQVSDAGTPAVCDPGAKLARRVREVGFKV 

1 1 1 1 1 1 1 1 1 M 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 M I M 

or f 7 5 - 1 GIQGKLVSVREHNERQMADKIVGYLSDGMVVAQVSDAGTPAVCDPGAKLARRVREAGFKV 

70 80 90 100 110 120 

45 130 140 150 160- 170 180 

orf 75a . pep VPWGASAVMAALSVAGVAGSDFYFNGFVPPKSGERRKLFAKWVRVAFPVVMFETPHRIG 

' I I I I I I I I ! i I I I i I I I I I I I I I i I I I I I I I I I I I I II I I: I i h I I I I II I I I 
orf 75-1 VPWGASAVMAALSVAGVEGSDFYFNGFVPPKSGERRKLFAKWVRAAFPI VMFETPHRIG 

130 140 150 160 170 180

190 200 210 220 230 240

orf75a.pep m ATLADMAELFPERRLMLAREITKTFETFLSGTVGEIQTALAADGNQSRGEMVLVLYPAQD 

I I II I I I I I Ml I I I I I I I M I I II I II I I I II I M I I M I I I M I II I Ml III I II 
orf 75-1 ATLAD^4AELFPERRLMLAREITKTFETFLSGTVGEIQTALSADGNQSRGEMVLVLYPAQD 
5 190 200 210 220 230 240 

250 260 270 280 290 

orf 75a . pep EKHEGLSESAQNIMKILTAELPTKQAAELAAKITGEGKKALYDLALSWKNKX 
I I I I I I I I Ml I I I I I I I I I I I I I I I I I I I I I I I I I I I I I M I I I I I I I II 
orf 75-1 EKHEGLSESAQNIMKILTAELPTKQAAELAAKITGEGKKALYDLALSWKNKX 
10 250 260 270 280 290 

Homology with a predicted ORF from N. gonorrhoeae 

ORF75 (SEQ ID NO: 286) shows 93.2% identity over a 292aa overlap with a predicted ORF
(ORF75.ng) (SEQ ID NO: 292) from N. gonorrhoeae:

orf 75 . pep MFVFQTAFXMFQKHLQKASDS WGGTLYWATP I GNLAD I TLRALAVLQKA AEDTR 56 

15 I 1 1 1 1 II 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 II 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 Mill 

orf 75ng MSVFQTAFFMFQKHLQKASDSWGGTLYWATPIGNLADITLRALAVLQKADIICAEDTR 60 

orf 75 . pep VTAQLLSAYGIQGKLVSWEHNERQMADKIVGYLSDGMVVAQVSDAGTPAVCDPGAKLAR 116 

II I II I I I II I I Ml I II I I I I I I M I I M M M I I I M II II M I II II II M I I M I 
orf 75ng VTAQLLSAYGIQGRLVSVREHNERQMADKVIGFLSDGLWAQVSDAGTPAVCDPGAKLAR 120 

20 orf 75 .pep RWEAGFKWPWGAXAVMAALSVAGVEGSDFYFNGFVPPKSGERRKLFAKWVRAAFPIV 176 

IIIMIIIIIIIIII lllllllllll I II I II I 1 1 1 1 I M I M II II M I II 1 1 M 

or f 7 5ng RVREAGFKWPWGASAVMAALS VAGVAESDFYFNGFVPPKSGERRKLFAKWVRAAFPW 180 

orf 75 .pep MFETPHRIGAALADMAELFPERRLMLAREITKTFETFLSGTVGEIQTALSADGDQSRGEM 236 

II I II 1 1 1 1 1 M M 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 II 1 1 M M 1 1 1 1 II I M 1 1 M I II 

25 orf 75ng MFETPHRIGATLADMAELFPERRLMLAREITKTFETFLSGTVGEIQTALAADGNQSRGEM 240 

orf 75 . pep VLVLYPAQDEKHEGLSESAQNIMKILTAELPTKQAAELAAKITGEGKKALYD 288 

I I II I II I I II I I I I II I I I I I I I I : I II I I I I II I II I I I I I I II I I I I I 
orf 75ng VLVLYPAQDEKHEGLSESAQNAMKILAAELPTKQAAELAAKITGEGKKALYDLALSWKNK 300 

An ORF75ng nucleotide sequence [<SEQ ID 291>] (SEQ ID NO: 291) was predicted to encode a
protein having amino acid sequence [<SEQ ID 292>] (SEQ ID NO: 292):

1 MSVFQTAFFM FQKHLQKASD SVVGGTLYVV ATPIGNLADI TLRALAVLQK
51 ADIICAEDTR VTAQLLSAYG IQGRLVSVRE HNERQMADKV IGFLSDGLVV
101 AQVSDAGTPA VCDPGAKLAR RVREAGFKVV PVVGASAVMA ALSVAGVAES
151 DFYFNGFVPP KSGERRKLFA KWVRAAFPVV MFETPHRIGA TLADMAELFP

201 ERRLMLAREI TKTFETFLSG TVGEIQTALA ADGNQSRGEM VLVLYPAQDE
251 KHEGLSESAQ NAMKILAAEL PTKQAAELAA KITGEGKKAL YDLALSWKNK
301 *

After further analysis, the following gonococcal DNA sequence [<SEQ ID 293>] (SEQ ID NO:
293) was identified:



1 ATGTTTCAGA AACACTTGCA GAAAGCCTCC GACAGCGTCG TCGGAGGGAC

51 ATTATACGTG GTTGCCACGC CCATCGGCAA TTTGGCAGAC ATTACCCTGC

101 GCGCTTTGGC GGTATTGCAA AAGGCGGACA TCATTTGTGC CGAAGACACG 

151 CGCGTTACTG CGCAGCTTTT GAGCGCGTAC GGCATTCAGG GCAGGTTGGT 

201 CAGTGTGCGC GAACACAACG AGCGGCAGAT GGCGGACAAG GTAATCGGTT 

251 TCCTTTCAGA CGGCCTGGTT GTGGCGCAGG TTTCCGATGC GGGTACGCCG 

301 GCCGTGTGCG ACCCGGGCGC GAAACTCGCC CGCCGCGTGC GCGAAGCAGG 

351 GTTCAAAGTC GTTCCCGTCG TGGGCGCAAG CGCGGTAATG GCGGCGTTGA 

401 GTGTGGCCGG TGTGGCGGAA TCCGATTTTT ATTTCAACGG TTTTGTACCG

451 CCGAAATCGG GCGAACGTAG GAAATTGTTT GCCAAATGGG TGCGGGCGGC

501 ATTTCCTGTC GTCATGTTTG AAACGCCGCA CCGAATCGGG GCAACGCTTG 

551 CCGATATGGC GGAATTGTTC CCCGAACGCC GTCTGATGCT GGCGCGCGAA 

601 ATGACGAAAA CGTTTGAAAC GTTCTTAAGC GGCACGGTTG GGGAAATTCA 

651 GACGGCATTG GCGGCGGACG GCAACCAATC GCGCGGCGAG ATGGTGTTGG 

701 TGCTTTATCC GGCGCAGGAT GAAAAACACG AAGGCTTGTC CGAGTCTGCG 

751 CAAAATGCGA TGAAAATCCT TGCGGCCGAG CTGCCGACCA AGCAGGCGGC 

801 GGAGCTTGCC GCCAAGATTA CAGGTGAGGG CAAAAAGGCT TTGTACGATT 

851 TGGCACTGTC GTGGAAAAAC AAATGA 

This corresponds to the amino acid sequence [<SEQ ID 294; ORF75ng-1>] (SEQ ID NO: 294;
ORF75ng-1):

1 MFQKHLQKAS DSVVGGTLYV VATPIGNLAD ITLRALAVLQ KADIICAEDT

51 RVTAQLLSAY GIQGRLVSVR EHNERQMADK VIGFLSDGLV VAQVSDAGTP

101 AVCDPGAKLA RRVREAGFKV VPVVGASAVM AALSVAGVAE SDFYFNGFVP

151 PKSGERRKLF AKWVRAAFPV VMFETPHRIG ATLADMAELF PERRLMLARE

201 ITKTFETFLS GTVGEIQTAL AADGNQSRGE MVLVLYPAQD EKHEGLSESA

251 QNAMKILAAE LPTKQAAELA AKITGEGKKA LYDLALSWKN K*

ORF75ng-1 (SEQ ID NO: 294) and ORF75-1 (SEQ ID NO: 288) show 96.2% identity in 291 aa
overlap:

10 20 30 40 50 60 

orf 75- 1 . pep MFQKHLQKASDS WGGTLYWATP I GNLAD I TLRALAVLQKAD 1 1 CAEDTRVTAQLLS AY 

IIIIIIIIIIIIIIMIIIIIIIIIIIIIIIIIIIMIIIIIIIIIIIIIIIIIIIIIII 

orf75ng-l MFQKHLQKASDS WGGTLYWATP I GNLAD I TLRALAVLQKAD 1 1 CAEDTRVTAQLLS AY 

10 20 30 40 50 60 

70 80 90 100 110 120 

orf 75-1. pep GIQGKLVSVREHNERQMADKI VGYLSDGMWAQVSDAGTPAVCDPGAKLARRVREAGFKV 

I I I I : I M I I I I I I M I I I i - I : I I I h I I I I I I I I I I' I ! I 1 I I I M I I I I I ■ I 
6rf75ng-l GIQGRLVSVREHNERQMADKVIGFLSDGLWAQVSDAGTPAVCDPGAKLARRVREAGFKV 

70 80 90 100 110 120 

130 140 150 160 170 180 

orf 75-1. pep VPWGASAVMAALSVAGVEGSDFYFNGFVPPKSGERRKLFAKWVRAAFP I VMFETPHRIG 

i 1 1 1 1 1 1 1 1 1 II 1 1 1 1 1 Mlllllllll Ml I IIIMIIIIIMIMIIIMI 

orf 75ng- 1 VPWGASAVMAALSVAGVAESDFYFNGFVPPKSGERRKLFAKWVRAAFPWMFETPHRIG 

130 140 150 160 170 180 

190 200 210 220 230 240 

orf 75-1 .pep ATLADMAELFPERRLMLAREITKTFETFLSGTVGEIQTALSADGNQSRGEMVLVLYPAQD 

I I I I I I I I I I I I I I I I I I I I I I I I I I II I M I I I I I I I I I : I I II I I I I I I I M I I II I I 
orf 75ng-l ATLADMAELFPERRLMLAREITKTFETFLSGTVGEIQTALAADGNQSRGEMVLVLYPAQD 

190 200 210 220 230 240 

250 260 270 280 290

orf75-1.pep EKHEGLSESAQNIMKILTAELPTKQAAELAAKITGEGKKALYDLALSWKNKX

I Mill II III I I I I:| I I I I I I I I I I I I I I I I I I M I I I I I I I I I I I I 
orf75ng-l EKHEGLSESAQNAMKILAAELPTKQAAELAAKITGEGKKALYDLALSWKNKX 
250 260 270 280 290 

Furthermore, ORF75ng-1 (SEQ ID NO: 294) shows significant homology to a hypothetical E. coli
protein (SEQ ID NO: 1131):

sp|P45528|YRAL_ECOLI HYPOTHETICAL 31.3 KD PROTEIN IN AGAI-MTR INTERGENIC REGION
(F286)
>gi|606086 (U18997) ORF_f286 [Escherichia coli]
>gi|1789535 (AE000395) hypothetical 31.3 kD protein in agai-mtr intergenic region
[Escherichia coli] Length = 286

Score = 218 bits (550), Expect = 3e-56
Identities = 128/284 (45%), Positives = 171/284 (60%), Gaps = 4/284 (1%)



Query:   4 KHLQKASDSVVGGTLYVVATPIGNLADITLRALAVLQKADIICAEDTRVTAQLLSAYGIQ 63
           K  Q A +S   G LY+V TPIGNLADIT RAL VLQ  D+I AEDTR T  LL  +GI
Sbjct:   2 KQHQSADNSQ--GQLYIVPTPIGNLADITQRALEVLQAVDLIAAEDTRHTGLLLQHFGIN 59

Query:  64 GRLVSVREHNERQMADKVIGFLSDGLVVAQVSDAGTPAVCDPGAKLARRVREAGFKVVPV 123
            RL ++ +HNE+Q A+ ++  L +G  +A VSDAGTP + DPG  L R  REAG +VVP+
Sbjct:  60 ARLFALHDHNEQQKAETLLAKLQEGQNIALVSDAGTPLINDPGYHLVRTCREAGIRVVPL 119

Query: 124 VGASAVMAALSVAGVAESDFYFNGFVPPKSGERRKLFAKWVRAAFPVVMFETPHRIGATL 183
            G  A + ALS AG+    F + GF+P KS  RR            ++ +E+ HR+  +L
Sbjct: 120 PGPCAAITALSAAGLPSDRFCYEGFLPAKSKGRRDALKAIEAEPRTLIFYESTHRLLDSL 179

Query: 184 ADMAELFPERR-LMLAREITKTFETFLSGTVGEIQTALAADGNQSRGEMVLVLYPAQDEK 242
            D+  +  E R ++LARE+TKT+ET     VGE+   +  D N+ +GEMVL++      +
Sbjct: 180 EDIVAVLGESRYVVLARELTKTWETIHGAPVGELLAWVKEDENRRKGEMVLIV-EGHKAQ 238

Query: 243 HEGLSESAQNAMKILAAELPTKQAAELAAKITGEGKKALYDLAL 286
            E L   A   + +L AELP K+AA LAA+I G  K ALY  AL
Sbjct: 239 EEDLPADALRTLALLQAELPLKKAAALAAEIHGVKKNALYKYAL 282
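
The summary line of the alignment (Identities and Gaps out of 284 aligned columns) can be recomputed directly from the aligned Query and Sbjct rows. The sketch below is illustrative only. It assumes the rows as transcribed from the alignment above, with residues that the scan split or merged rejoined so that each row spans the stated coordinate range; `alignment_stats` is a hypothetical helper, not part of BLAST. The Positives count is not recomputed because it depends on the scoring matrix, which the specification does not state.

```python
def alignment_stats(query: str, subject: str) -> dict:
    """Count aligned columns, identical columns and gap columns."""
    if len(query) != len(subject):
        raise ValueError("aligned rows must have the same number of columns")
    columns = len(query)
    identities = sum(q == s and q != "-" for q, s in zip(query, subject))
    gaps = sum(q == "-" or s == "-" for q, s in zip(query, subject))
    return {"columns": columns, "identities": identities, "gaps": gaps}


# Aligned rows of ORF75ng-1 (Query) and the E. coli protein (Sbjct), concatenated.
QUERY = (
    "KHLQKASDSVVGGTLYVVATPIGNLADITLRALAVLQKADIICAEDTRVTAQLLSAYGIQ"
    "GRLVSVREHNERQMADKVIGFLSDGLVVAQVSDAGTPAVCDPGAKLARRVREAGFKVVPV"
    "VGASAVMAALSVAGVAESDFYFNGFVPPKSGERRKLFAKWVRAAFPVVMFETPHRIGATL"
    "ADMAELFPERR-LMLAREITKTFETFLSGTVGEIQTALAADGNQSRGEMVLVLYPAQDEK"
    "HEGLSESAQNAMKILAAELPTKQAAELAAKITGEGKKALYDLAL"
)
SBJCT = (
    "KQHQSADNSQ--GQLYIVPTPIGNLADITQRALEVLQAVDLIAAEDTRHTGLLLQHFGIN"
    "ARLFALHDHNEQQKAETLLAKLQEGQNIALVSDAGTPLINDPGYHLVRTCREAGIRVVPL"
    "PGPCAAITALSAAGLPSDRFCYEGFLPAKSKGRRDALKAIEAEPRTLIFYESTHRLLDSL"
    "EDIVAVLGESRYVVLARELTKTWETIHGAPVGELLAWVKEDENRRKGEMVLIV-EGHKAQ"
    "EEDLPADALRTLALLQAELPLKKAAALAAEIHGVKKNALYKYAL"
)

stats = alignment_stats(QUERY, SBJCT)
pct_identity = round(100 * stats["identities"] / stats["columns"])
pct_gaps = round(100 * stats["gaps"] / stats["columns"])
print(stats, f"identity {pct_identity}%", f"gaps {pct_gaps}%")
```

With the rows transcribed this way, the counts agree with the reported Identities = 128/284 (45%) and Gaps = 4/284 (1%).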





Based on this analysis, including the presence of a putative transmembrane domain in the
gonococcal protein, it is predicted that the proteins from N. meningitidis and N. gonorrhoeae, and
their epitopes, could be useful antigens for vaccines or diagnostics, or for raising antibodies.



Example 35 

The following partial DNA sequence was identified in N. meningitidis [<SEQ ID 295>] (SEQ ID
NO: 295):



1 ATGAAACAGA AAAAAACCGC TGCCGCAGTT ATTGCTGCAA TGTTGGCAGG

51 TTTTGCGGCA GC.AAAGCAC CCGAAATCGA CCCGGCTTTG 

// 

651 GAGTTGG TCAGAAACCA GTTGGAGCAG GGTTTGAGAC 

701 AGGAAAAAGC CCGCTTGAAA ATCGATGCCC TTTTGGAAGA AAACGGTGTC 
751 AAACCGTAA

This corresponds to the amino acid sequence [<SEQ ID 296; ORF76>] (SEQ ID NO: 296;
ORF76):

1 MKQKKTAAAV IAAMLAGFAA XKAPEIDPAL 

// 

201 ELVRNQLEQG LRQEKARLKI DALLEENGVK 

251 P* 

Further work revealed the complete nucleotide sequence [<SEQ ID 297>] (SEQ ID NO: 297):

1 ATGAAACAGA AAAAAACCGC TGCCGCAGTT ATTGCTGCAA TGTTGGCAGG 

51 TTTTGCGGCA GCCAAAGCAC CCGAAATCGA CCCGGCTTTG GTGGATACGC 

101 TGGTGGCGCA GATCATGCAG CAGGCAGACC GGCATGCGGA GCAGTCCCAA 

151 AAACCGGACG GGCAGGCAAT CCGAAACGAT GCCGTCCGCC GGCTACAAAC 

201 TTTGGAAGTT TTGAAAAACA GGGCATTGAA GGAAGGTTTG GATAAGGATA 

251 AGGATGTCCA AAACCGCTTT AAAATCGCCG AAGCGTCTTT TTATGCCGAG
301 GAGTACGTCC GTTTTCTGGA ACGTTCGGAA ACGGTTTCCG AAGACGAGCT

351 GCACAAGTTT TACGAACAGC AAATCCGCAT GATCAAATTG CAGCAGGTCA

401 GCTTCGCAAC CGAAGAGGAG GCGCGTCAGG CGCAGCAGCT CCTGCTCAAA
451 GGGCTGTCTT TTGAAGGGCT GATGAAGCGT TATCCGAACG ACGAGCAGGC
501 TTTTGACGGT TTCATTATGG CGCAGCAGCT TCCCGAGCCG CTGGCTTCGC 
551 AGTTTGCCGC GATGAATCGG GGCGACGTTA CCCGCGATCC GGTCAAATTG 
601 GGCGAACGCT ATTATCTGTT CAAACTCAGC GAGGTCGGGA AAAACCCCGA 
651 CGCGCAGCCT TTCGAGTTGG TCAGAAACCA GTTGGAGCAG GGTTTGAGAC 
701 AGGAAAAAGC CCGCTTGAAA ATCGATGCCC TTTTGGAAGA AAACGGTGTC 
751 AAACCGTAA 

This corresponds to the amino acid sequence [<SEQ ID 298; ORF76-1>] (SEQ ID NO: 298;
ORF76-1):

1 MKQKKTAAAV IAAMLAGFAA AKAPEIDPAL VDTLVAQIMQ QADRHAEQSQ

51 KPDGQAIRND AVRRLQTLEV LKNRALKEGL DKDKDVQNRF KIAEASFYAE

101 EYVRFLERSE TVSEDELHKF YEQQIRMIKL QQVSFATEEE ARQAQQLLLK

151 GLSFEGLMKR YPNDEQAFDG FIMAQQLPEP LASQFAAMNR GDVTRDPVKL

201 GERYYLFKLS EVGKNPDAQP FELVRNQLEQ GLRQEKARLK IDALLEENGV

251 KP*

Computer analysis of this amino acid sequence gave the following results: 



Homology with a predicted ORF from N. meningitidis (strain A) 



ORF76 (SEQ ID NO: 296) shows 96.7% identity over a 30aa overlap and 96.8% identity over a
31aa overlap with an ORF (ORF76a) (SEQ ID NO: 300) from strain A of N. meningitidis:



10 20 30 

orf 76 . pep MKQKKTAAAVIAAMLAGFAAXKA PEIDPAL 

1 1 1 1 1 1 1 1 1 1 U M I M MINIMI 

orf 76a MKQKKTAAAVIAAMLAGFAAAKA PEIDPALVDTLVAQIMQQADRHAEQSQKPDGQAIRND 

10 20 30 40 50 60 

// 

70 80 90 

or f 7 6 . pep XELVRNQLEQGLRQEKARLKIDALLEENGVKPX 

MINI Mllllllllll IMIMIIIM
orf76a DVTRDPVKLGERYYLFKLSEVGKNPDAQPFELVRNQLEQGLRQEKARLKIDAILEENGVKPX
200 210 220 230 240 250 

The complete length ORF76a nucleotide sequence [<SEQ ID 299>] (SEQ ID NO: 299) is:



1 ATGAAACAGA AAAAAACCGC TGCCGCAGTT ATTGCTGCAA TGTTGGCAGG

51 TTTTGCGGCA GCCAAAGCAC CCGAAATCGA CCCGGCTTTG GTGGATACGC 

101 TGGTGGCGCA GATCATGCAG CAGGCAGACC GGCATGCGGA GCAGTCCCAA 

151 AAACCGGACG GGCAGGCAAT CCGAAACGAT GCCGTCCGTC GGCTGCAAAC 

201 TTTGGAAGTT TTGAAAAACA GGGCATTGAA GGAAGGTTTG GATAAGGATA 

251 AGGATGTCCA AAACCGCTTT AAAATCGCCG AAGCGTCTTT TTATGCCGAG

301 GAGTACGTCC GTTTTCTGGA ACGTTCGGAA ACGGTTTCCG AAAGCGCACT 

351 GCGTCAGTTT TATGAGCGGC AAATCCGCAT GATCAAATTG CAGCAGGTCA

401 GCTTCGCAAC CGAAGAGGAG GCGCGTCAGG CGCAGCAGCT CCTGCTCAAA
451 GGGCTGTCTT TTGAAGGGCT GATGAAGCGT TATCCGAACG ACGAGCAGGC

501 TTTTGACGGT TTCATTATGG CGCAGCAGCT TCCCGAGCCG CTGGCTTCGC

551 AGTTTGCAGC GATGAATCGG GGCGACGTTA CCCGCGATCC GGTCAAATTG 

601 GGCGAACGCT ATTATCTGTT CAAACTCAGC GAGGTCGGGA AAAACCCCGA 

651 CGCGCAGCCT TTCGAGTTGG TCAGAAACCA GTTGGAACAA GGTTTGAGAC 

701 AGGAAAAAGC CCGCTTGAAA ATCGATGCCA TTTTGGAAGA AAACGGTGTC 

751 AAACCGTAA

This encodes a protein having amino acid sequence [<SEQ ID 300>] (SEQ ID NO: 300):

1 MKQKKTAAAV IAAMLAGFAA AKAPEIDPAL VDTLVAQIMQ QADRHAEQSQ

51 KPDGQAIRND AVRRLQTLEV LKNRALKEGL DKDKDVQNRF KIAEASFYAE

101 EYVRFLERSE TVSESALRQF YERQIRMIKL QQVSFATEEE ARQAQQLLLK

151 GLSFEGLMKR YPNDEQAFDG FIMAQQLPEP LASQFAAMNR GDVTRDPVKL

201 GERYYLFKLS EVGKNPDAQP FELVRNQLEQ GLRQEKARLK IDAILEENGV

251 KP*

ORF76a (SEQ ID NO: 300) and ORF76-1 (SEQ ID NO: 298) show 97.6% identity in 252 aa
overlap:



10 20 30 40 50 60 

orf 76a . pep MKQKKTAAAVIAAMLAGFAAAKAPEIDPALVDTLVAQIMQQADRHAEQSQKPDGQAIRND 
I I I I I h I I I I I I I I I II I I I II I M I I I I I I I M I I I I I I I I I II I I I I I I I M I I 
35 orf 76 - 1 MKQKKTAAAVIAAM3LAGFAAAKAPEIDPALVDTLVAQIMQQADRHAEQSQKPDGQAIRND 

10 20 30 40 50 60 



70 80 90 100 110 120 

orf 76a . pep AVRRLQTLEVLKNRALKEGLDKDKDVQNRFKIAEASFYAEEYVRFLERSETVSESALRQF 

I I I I I I I I II M I I M I I I I I M I II II M I I II I I I I I I I I I I I M I I I I I I h h:| 
40 orf 76 - 1 AVRRLQTLEVLKNRALKEGLDKDKDVQNRFKIAEASFYAEEYVRFLERSETVSEDELHKF 

70 80 90 100 110 120 



130 140 150 160 170 180 

orf 76a . pep YERQ I RM I KLQQVS FATEEEARQAQQLLLKGLS FEGLMKRYPNDEQAFDGF IMAQQLPE P 
I I : I M i I I I I I I I I II I I II II I II I I I I i I I I I I I II I I I I I I I M I I I I I I II M I 
45 orf 76-1 YEQQIRMIKLQQVSFATEEEARQAQQLLLKGLSFEGLMKRYPNDEQAFDGFIMAQQLPEP 

130 140 150 160 170 180 

190 200 210 220 230 240 

orf 76a . pep LASQFAAMNRGDVTRDPVKLGERYYLFKLSEVGKNPDAQPFELVRNQLEQGLRQEKARLK 

1 1 1 1 1 II 1 1 1 1 1 M I II I II II 1 1 1 1 1 1 I M 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 M I II I M 

orf76-1 LASQFAAMNRGDVTRDPVKLGERYYLFKLSEVGKNPDAQPFELVRNQLEQGLRQEKARLK

190 200 210 220 230 240

250 

orf 76a .pep IDAILEENGVKPX 

Ilhlllllllll 
or f 7 6 - 1 I DALLEENGVKPX 

250 

Homology with a predicted ORF from N. gonorrhoeae 

The aligned aa sequences of ORF76 (SEQ ID NO: 296) and a predicted ORF (ORF76.ng) (SEQ ID
NO: 302) from N. gonorrhoeae of the N- and C-termini show 96.7% and 100% identity in 30 aa and
31 aa overlap, respectively:

orf 76. pep MKQKKTAAAVI AAMLAGFAAXKAPE IDPAL 30 

1 1 1 1 ! 1 1 1 1 1 1 i 1 1 1 1 III lllllllll 

orf 76ng MKQKKTAAAVIAAMLAGFAAAKAPEIDPALVDTLVAQIMQQADRHAEQSQRPDGQAIRND 60 

// 

orf76 pep ELVRNQLEQGLRQEKARLKIDALLEENGVKP 251 

I I I I I I I I I I ; I I I M I I I I M II I I I I I I I 

orf 7 6ng VTRNPVKLGERYYLFKLGAVGKNPDAQPFELVRNQLEQGLRQEKARLKIDALLEENGVKP 251 

The complete length ORF76ng nucleotide sequence [<SEQ ID 301>] (SEQ ID NO: 301) is:

1 ATGAAACAGA AAAAGACCGC TGCCGCAGTT ATTGCTGCAA TGTTGGCAGG 

51 TTTTGCGGCA GCCAAAGCAC CCGAAATCGA CCCGGCTTTG GTGGATACGC 

101 TGGTGGCGCA GATCATGCAG CAGGCAGACC GGCATGCGGA GCAGTCCCAA 

151 AGACCGGACG GGCAGGCAAT CCGAAACGAT GCCGTCCGCC GGCTGCAAAC 

201 TTTGGAAGTT TTGAAAAACA GGGCATTGAA GGAAGGTTTG GATAAGGATA 

251 AGGATGTCCA AAACCGCTTT AAAATCGCCG AAGCGTCTTT TTATGCCGAG 

301 GAGTACGTCC GTTTTCTGGA ACGTTCGGAA ACGGTTTCCG AAAGCGCACT 

351 GCGTCAGTTT TATGAGCGGC AAATCCGCAT GATCAAATTG CAGCAGGTCA

401 GCTTCGCAAC CGAAGAGGAG GCGCGTCAGG CGCAGCAGCT CCTGCTCAAA
451 GGGCTGTCTT TTGAAGGGCT GATGAAGCGT TATCCGAACG ACGAGCAGGC
501 GTTCGACGGT TTCATTATGG CGCAGCAGCT TCCCGAGCCG CTGGCTTcgc 
551 agtttgCCGG TATGAACCGT GGCGACGTTA CCCGCAATCC GGTCAAATTG 
601 GGCGAACGCT ATTACCTGTT CAAACTCGGC GCGGTCGGGA AAAACCCCGA 
651 CGCGCAGCCT TTCGAGTTGG TCAGAAACCA GTTGGAACAA GGTTTGAGGC 
701 AGGAAAAAGC CCGCTTGAAA ATCGATGCCC TTTTGGAaga Aaacggtgtc 
751 AaacCGTAA 

This encodes a protein having amino acid sequence [<SEQ ID 302>] (SEQ ID NO: 302):

1 MKQKKTAAAV IAAMLAGFAA AKAPEIDPAL VDTLVAQIMQ QADRHAEQSQ

51 RPDGQAIRND AVRRLQTLEV LKNRALKEGL DKDKDVQNRF KIAEASFYAE

101 EYVRFLERSE TVSESALRQF YERQIRMIKL QQVSFATEEE ARQAQQLLLK

151 GLSFEGLMKR YPNDEQAFDG FIMAQQLPEP LASQFAGMNR GDVTRNPVKL

201 GERYYLFKLG AVGKNPDAQP FELVRNQLEQ GLRQEKARLK IDALLEENGV

251 KP*

ORF76ng (SEQ ID NO: 302) and ORF76-1 (SEQ ID NO: 298) show 96.0% identity in 252 aa
overlap:

10 20 30 40 50 60

or f 76-1. pep MKQKKTAAAVI AAMLAGFAAAKAPEIDPALVDTLVAQ IMQQADRHAEQSQKPDGQAIRND 

1 1 1 M i 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 II 1 1 1 1 1 1 i 1 1 M M 1 1 M . 1 1 1 1 1 h 1 1 1 1 1 1 1 1 1 

orf 76ng MKQKKTAAAV I AAMLAGFAAAKAPEIDPALVDTLVAQ I MQQADRHAEQSQRPDGQA I RND 

5 10 20 30 40 50 60 

70 80 90 100 110 120 

orf 76-1. pep AVRRLQTLEVLKNRALKEGLDKDKDVQNRFKI AEASFYAEEYVRFLERSETVSEDELHKF 

II Mill MM IIIIMIIIIIIII lllllllllllllllllilllllllMlh MM 

orf 76ng AVRRLQTLEVLKNRALKEGLDKDKDVQNRFKIAEASFYAEEYVRFLERSETVSESALRQF 
10 70 80 90 100 110 120 

130 140 150 160 170 180 

orf 76 - 1 . pep YEQQ I RM I KLQQVS FATEEEARQAQQLLLKGLS FEGLMKRY PNDEQAFDGF I MAQQLPEP 

ll :| 1 1 1 1 M I II M I II M M i I II !l 1 1 1 1 1 1 1 1 ! 1 1 1 1 1 1 1 1 Ml I II 1 1 Ml I 

orf 76ng YERQ I RM I KLQQVS FATEEEARQAQQLLLKGLS FEGLMKRY PNDEQAFDGF I MAQQLP E P 

15 130 140 150 160 170 180 

190 200 210 220 230 240 

orf 76-1 .pep LASQFAAMNRGDVTRDPVKLGERYYLFKLSEVGKNPDAQPFELVRNQLEQGLRQEKARLK 

1 1 1 1 1 1 H 1 1 1 1 1 1 1 = 1 1 1 1 1 1 1 1 1 1 1 1 1 = I M M 1 1 1 II 1 1 1 1 1 M 1 1 1 M 1 1 1 1 1 

orf 76ng LASQFAGMNRGDVTRNPVKLGERYYLFKLGAVGKNPDAQPFELVRNQLEQGLRQEKARLK 
20 190 200 210 220 230 240 

250 

orf 76-1. pep IDALLEENGVKPX 

I M 1 1 i 1 1 1 1 1 1 f 

O r f 7 6 ng I D ALLEENGVKPX 

25 250 

Furthermore, ORF76ng (SEQ ID NO: 302) shows significant homology to a B. subtilis export
protein precursor (SEQ ID NO: 1132):

sp|P24327|PRSA_BACSU PROTEIN EXPORT PROTEIN PRSA PRECURSOR >gi|98227|pir||S15269
33K lipoprotein - Bacillus subtilis >gi|39782 (X57271) 33kDa lipoprotein [Bacillus
subtilis]
>gi|2226124|gnl|PID|e325181 (Y14077) 33kDa lipoprotein [Bacillus subtilis]
>gi|2633331|gnl|PID|e1182997 (Z99109) molecular chaperonin [Bacillus subtilis]
Length = 292

Score = 50.4 bits (118), Expect = 1e-05
Identities = 48/199 (24%), Positives = 82/199 (41%), Gaps = 32/199 (16%)

Query: 70 VLKNRALKEGLDK DKDVQNRFKI AEAS F YAEEYVRFLERSETVSE 114 

VL ++ LDK DK++ N+ K + Y ++Y+ + + E +++ 

Sbjct: 53 VLTQLVQEKVLDKKYKVSDKEIDNKLKEYKTQLGDQYTALEKQYGKDYLKEQVKYELLTQ 112

Query: 115 SA LRQFYERQ I RM I KLQQVS FATEEEARQAQQLLLKGLS FEGLMKR YPN 163

A +++++E 1+ + A ++ A + ++ L KG FE L K Y 

Sbjct: 113 KAAKDNIKVTDADIKEYWEGLKGKIRASHILVADKKTAEEVEKKLKKGEKFEDLAKEYST 172 

Query: 164 DEQAFDG FIMAQQLPEPLASQFAAMNRGDVTRDPVKLGERYYLFKLSEVGKNPDA 218 

DAG F Q+E+ + G+V+ DPVK Y++ K +E D 

Sbjct: 173 DSSASKGGDLGWFAKEGQMDETFSKAAFKLKTGEVS -DPVKTQYGYHI IKKTEERGKYDD 231



Query: 219 QPFELVRNQLEQGLRQEKA 237 

EL LEQ L A 
Sbjct: 232 MKKELKSEVLEQKLNDNAA 250 
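
The percentages in the BLAST summary line above follow directly from the raw counts. The following is a minimal Python sketch, assuming each ratio is truncated to a whole percent; that assumption reproduces the values reported here and in the later E. coli comparison (where 70/468 is reported as 14%). The function name `blast_percent` is illustrative and is not part of any BLAST tool.

```python
def blast_percent(count: int, alignment_length: int) -> int:
    """Return count / alignment_length as a whole percent, truncated (assumed convention)."""
    if alignment_length <= 0:
        raise ValueError("alignment length must be positive")
    return (100 * count) // alignment_length


# Counts reported for ORF76ng versus the B. subtilis protein (alignment length 199).
identities = blast_percent(48, 199)
positives = blast_percent(82, 199)
gaps = blast_percent(32, 199)
print(identities, positives, gaps)
```

The same helper can be applied to any of the "Identities", "Positives" and "Gaps" fields in this document to cross-check an OCR-damaged summary line.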
Based on this analysis, including the presence of a putative leader sequence and an RGD motif in the
gonococcal protein, it was predicted that the proteins from N. meningitidis and N. gonorrhoeae, and
their epitopes, could be useful antigens for vaccines or diagnostics, or for raising antibodies.

ORF76-1 (SEQ ID NO: 298) (27.8kDa) was cloned in the pET vector and expressed in E. coli, as
described above. The products of protein expression and purification were analyzed by SDS-
PAGE. Figure 10A shows the results of affinity purification of the His-fusion protein. Purified
His-fusion protein was used to immunise mice, whose sera were used for Western blot (Figure
10B), ELISA (positive result), and FACS analysis (Figure 10C). These experiments confirm that
ORF76-1 (SEQ ID NO: 298) is a surface-exposed protein, and that it is a useful immunogen.

Example 36 

The following partial DNA sequence was identified in N. meningitidis [<SEQ ID 303>] (SEQ ID
NO: 303):

1 ATGAAAAAAT CTTTCCTTAC GCTTGTTCTG TATTCGTCTT TACTTACCGC 

51 CAGCGAAATT GCCTTACCCC TTGGAATTGG GGATTGAAAC CTTACCGGCG 

101 GCAAAAATTG CGGAAACGTT TGCGCTGACA TTTGTGATTG CTGCGCTGTA 

151 TCTGTTTGCG CGTAATAAGG TGACGCGTTT GTTGATTGCG GTGTTTTTTG 

201 CGTTCAGCAT TATTGCCAAC AATGTGCATT ACGCGGATTA TCAAAGCTGG 

251 ATGACG 

// 

1201 CAAACCGTAT TCGAGCAGCT GCAAAAGACT CCTGACGGCA 

1251 ACTGGCTGTT TGCCTATACC TCCGATCATG GCCAGTATGT TCGCCAAGAT 

1301 ATCTACAATC AAGGCACGGT GCAGCCCGAC AGCTATCTCG TGCCGCTAGT 

1351 GTTGTACAGC CCGGATAAGG CCGTGCAACA GGCTGCCAAC CAGGCTTTTG 

1401 CGCCTTGCGA GATTGCCTTC CATCAGCAGC TTTCAACGTT CCTGATTCAC

1451 ACGTTGGGCT ACGATATGCC GGTTTCAGGT TGTCGCGAAG GCTCGGTAAC 

1501 GGGCAACCTG ATTACGGGTG ATGCAGGCAG CTTGAACATT CGCGACGGCA 

1551 AGGCGGAATA TGTTTATCCG CAATGA 

This corresponds to the amino acid sequence [<SEQ ID 304; ORF81>] (SEQ ID NO: 304;
ORF81):

1 MKKSFLTLVL YSSLLTASEI AYPLELGIET LPAAKIAETF ALTFVIAALY 

51 LFARNKVTRL LIAVFFAFSI IANNVHYADY QSWMT 

// 

401 ...QTVFEQL QKTPDGNWLF AYTSDHGQYV RQDIYNQGTV QPDSYLVPLV 
451 LYSPDKAVQQ AANQAFAPCE IAFHQQLSTF LIHTLGYDMP VSGCREGSVT
501 GNLITGDAGS LNIRDGKAEY VYPQ* 
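
The correspondence between the partial DNA sequence and the amino acid sequence above can be checked by translating codons with the standard genetic code. The following is a minimal Python sketch; the helper name `translate` is illustrative, the reading frame is assumed to start at the first base supplied, stop codons are written as `*` as in the listing, and any codon containing an ambiguous base is reported as `X`.

```python
from itertools import product

# Standard genetic code, with codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(dna: str) -> str:
    """Translate a DNA string (whitespace ignored) in frame 1; unknown codons give 'X'."""
    dna = "".join(dna.split()).upper()
    usable = len(dna) - len(dna) % 3  # drop a trailing partial codon, if any
    return "".join(CODON_TABLE.get(dna[i:i + 3], "X") for i in range(0, usable, 3))


# First 30 bases and last 15 bases of the partial sequence shown above.
print(translate("ATGAAAAAAT CTTTCCTTAC GCTTGTTCTG"))
print(translate("GTTTATCCGC AATGA"))
```

The two fragments translate to the first ten residues and the last four residues (plus the stop) of the ORF81 listing.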

Further work revealed the complete nucleotide sequence [<SEQ ID 305>] (SEQ ID NO: 305):



1 ATGAAAAAAT CTTTCCTTAC GCTTGTTCTG TATTCGTCTT TACTTACCGC 
51 CAGCGAAATT GCCTATCGCT TTGTATTTGG GATTGAAACC TTACCGGCGG

101 CAAAAATTGC GGAAACGTTT GCGCTGACAT TTGTGATTGC TGCGCTGTAT 

151 CTGTTTGCGC GTTATAAGGT GACGCGTTTG TTGATTGCGG TGTTTTTTGC 

201 GTTCAGCATT ATTGCCAACA ATGTGCATTA CGCGGTTTAT CAAAGCTGGA 

251 TGACGGGCAT CAATTATTGG CTGATGCTGA AAGAGGTTAC CGAAGTCGGC 

301 AGCGCGGGTG CGTCGATGTT GGATAAGTTG TGGCTGCCTG TGTTGTGGGG 

351 CGTGTTGGAA GTCATGTTGT TTTGCAGCCT TGCCAAGTTC CGCCGTAAGA 

401 CGCATTTTTC TGCCGATATA CTGTTTGCCT TCCTAATGCT GATGATTTTC

451 GTGCGTTCGT TCGACACGAA ACAAGAGCAC GGTATTTCGC CCAAACCGAC

501 ATACAGCCGC ATCAAAGCCA ATTATTTCAG CTTCGGTTAT TTTGTCGGAC 

551 GCGTGTTGCC GTATCAGTTG TTTGATTTAA GCAGGATTCC CGCCTTTAAG 

601 CAGCCTGCTC CAAGCAAAAT CGGGCAGGGC AGTGTTCAAA ATATCGTCCT 

651 GATTATGGGC GAAAGCGAAA GCGCGGCGCA TTTGAAGCTG TTTGGCTACG 

701 GACGCGAAAC TTCGCCGTTT TTAACCCGGC TGTCGCAAGC CGATTTTAAG 

751 CCGATTGTGA AACAAAGTTA TTCCGCAGGC TTTATGACTG CAGTGTCCCT 

801 GCCCAGTTTT TTCAATGCGA TACCGCACGC CAACGGCTTG GAACAAATCA 

851 GCGGCGGCGA TACCAATATG TTCCGCCTCG CCAAAGAGCA GGGCTATGAA 

901 ACGTATTTTT ACAGCGCGCA GGCGGAAAAC GAGATGGCGA TTTTGAACTT 

951 AATCGGTAAG AAATGGATAG ACCATCTGAT TCAGCCGACG CAACTTGGCT 

1001 ACGGCAACGG CGACAATATG CCCGATGAGA AGCTGCTGCC GTTGTTCGAC 

1051 AAAATCAATT TGCAGCAGGG CAAGCATTTT ATCGTGTTGC ACCAACGCGG 

1101 TTCGCACGCC CCATACGGCG CATTGTTGCA GCCTCAAGAT AAAGTATTCG 

1151 GCGAAGCCGA TATTGTGGAT AAGTACGACA ACACCATCCA CAAAACCGAC 

1201 CAAATGATTC AAACCGTATT CGAGCAGCTG CAAAAGCAGC CTGACGGCAA
1251 CTGGCTGTTT GCCTATACCT CCGATCATGG CCAGTATGTT CGCCAAGATA

1301 TCTACAATCA AGGCACGGTG CAGCCCGACA GCTATCTCGT GCCGCTAGTG
1351 TTGTACAGCC CGGATAAGGC CGTGCAACAG GCTGCCAACC AGGCTTTTGC

1401 GCCTTGCGAG ATTGCCTTCC ATCAGCAGCT TTCAACGTTC CTGATTCACA
1451 CGTTGGGCTA CGATATGCCG GTTTCAGGTT GTCGCGAAGG CTCGGTAACG
1501 GGCAACCTGA TTACGGGTGA TGCAGGCAGC TTGAACATTC GCGACGGCAA 
1551 GGCGGAATAT GTTTATCCGC AATGA 
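
Listings such as the one above interleave position numbers with ten-base groups. The following is a minimal Python sketch for recovering the contiguous sequence from such a listing. It assumes that every whitespace-separated token made only of letters (optionally with `*`) is sequence data and that everything else is a position label or separator; `join_listing` is an illustrative name, not an existing tool.

```python
def join_listing(listing: str) -> str:
    """Concatenate the sequence groups of a numbered listing, dropping position labels."""
    groups = []
    for token in listing.split():
        # Keep tokens such as 'GGCAACCTGA' or 'VYPQ*'; skip '1501', '//' and similar.
        if any(ch.isalpha() for ch in token) and all(ch.isalpha() or ch == "*" for ch in token):
            groups.append(token.upper())
    return "".join(groups)


# Last two lines of the listing above.
tail = """
1501 GGCAACCTGA TTACGGGTGA TGCAGGCAGC TTGAACATTC GCGACGGCAA
1551 GGCGGAATAT GTTTATCCGC AATGA
"""
sequence = join_listing(tail)
print(len(sequence), sequence[-5:])
```

Because purely numeric tokens are discarded, a position label that was split during scanning (for example `14 01`) does not end up in the recovered sequence.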

This corresponds to the amino acid sequence [<SEQ ID 306; ORF81-1>] (SEQ ID NO: 306;
ORF81-1):



1 MKKSFLTLVL YSSLLTASEI AYRFVFGIET LPAAKIAETF ALTFVIAALY 

51 LFARYKVTRL LIAVFFAFSI IANNVHYAVY QSWMTGINYW LMLKEVTEVG 

101 SAGASMLDKL WLPVLWGVLE VMLFCSLAKF RRKTHFSADI LFAFLMLMIF 

151 VRSFDTKQEH GISPKPTYSR IKANYFSFGY FVGRVLPYQL FDLSRIPAFK 

201 QPAPSKIGQG SVQNIVLIMG ESESAAHLKL FGYGRETSPF LTRLSQADFK 

251 PIVKQSYSAG FMTAVSLPSF FNAIPHANGL EQISGGDTNM FRLAKEQGYE 

301 TYFYSAQAEN EMAILNLIGK KWIDHLIQPT QLGYGNGDNM PDEKLLPLFD 

351 KINLQQGKHF IVLHQRGSHA PYGALLQPQD KVFGEADIVD KYDNTIHKTD

401 QMIQTVFEQL QKQPDGNWLF AYTSDHGQYV RQDIYNQGTV QPDSYLVPLV
451 LYSPDKAVQQ AANQAFAPCE IAFHQQLSTF LIHTLGYDMP VSGCREGSVT 
501 GNLITGDAGS LNIRDGKAEY VYPQ* 

Computer analysis of this amino acid sequence gave the following results: 



Homology with a predicted ORF from N. meningitidis (strain A) 



ORF81 (SEQ ID NO: 304) shows 84.7% identity over an 85aa overlap and 99.2% identity over a
121aa overlap with an ORF (ORF81a) (SEQ ID NO: 308) from strain A of N. meningitidis:
10 20 30 40 50 60

orf 81 .pep MKKSFLTLVLYSSLLTAS EIAYPLELGIETLPAA KIAETFALTFVIAALYLFA RNKVTRL 

lih- IIIIIIIIIIIM : ^ M 1 1 1 Mhl I i 1 1 1 1 M i 1 1 1 1 I ; hill 

O r f 8 1 a MKKSLFVLFLYSSLLTAS E I AYRFVFG I ETLPAA KMAETFALTFVIAALYLFA RYKATRL 

10 20 30 40 50 60 



70 80 
orf 8 1 . pep LIAVFFAFS I IANNVH YADYQSWMT 

I I I I I I I I I I I M I I I lllhl 
or f 8 1 a LIAVFFAFS 1 1 ANNVH YAVYQSW I TG I NYWLMLKE I TEVGGAGASMLDKLW LPALWGVLE 

70 80 90 100 110 120 

// 

120 130 140 

orf 8 1 . pep QTVFEQLQKTPDGNWLFAYTSDHGQYVRQD 

MINIM MMMMMM MMIM 

orf 81a IPHANGLEQISGGDIVDKYDNTIHKTDQMIQTVFEQLQKQPDGNWLFAYTSDHGQYVRQD 
280 290 300 310 320 330 



150 160 170 180 190 200 

orf 81 . pep IYNQGTVQPDSYLVPLVLYSPDKAVQQAANQAFAPCEIAFHQQLSTFLIHTLGYDMPVSG 

MMMMMMMMM M MMMMMM MMMM M MM I MIMM M I 

orf 81a IYNQGTVQPDSYLVPLVLYSPDKAVQQAANQAFAPCEIAFHQQLSTFLIHTLGYDMPVSG 
340 350 360 370 380 390 



210 220 230 

orf 81 . pep CREGS VTGNL I TGDAGSLNI RDGKAE YVYPQX 

I N M 1 1 M I M II M 1 1 1 M 1 1 1 1 1 1 1 1 1 1 

O r f 8 1 a CREGS VTGNL I TGDAGSLN I RDGKAE YVYPQX 

400 410 420 
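
The 84.7% figure quoted for the N-terminal overlap can be reproduced by counting identical positions. The following is a minimal Python sketch, assuming the two fragments are already aligned and contain no gaps; the residues are copied from the ORF81 and ORF81a amino acid listings in this document, and the function name `percent_identity` is illustrative.

```python
def percent_identity(a: str, b: str) -> float:
    """Percent of positions with identical residues in two equal-length aligned strings."""
    if len(a) != len(b) or not a:
        raise ValueError("sequences must be non-empty and of equal length")
    matches = sum(1 for x, y in zip(a, b) if x == y)
    return 100.0 * matches / len(a)


# N-terminal 85 residues of ORF81 and of ORF81a, written in groups of ten.
ORF81_N = (
    "MKKSFLTLVL" "YSSLLTASEI" "AYPLELGIET" "LPAAKIAETF" "ALTFVIAALY"
    "LFARNKVTRL" "LIAVFFAFSI" "IANNVHYADY" "QSWMT"
)
ORF81A_N = (
    "MKKSLFVLFL" "YSSLLTASEI" "AYRFVFGIET" "LPAAKMAETF" "ALTFVIAALY"
    "LFARYKATRL" "LIAVFFAFSI" "IANNVHYAVY" "QSWIT"
)

print(round(percent_identity(ORF81_N, ORF81A_N), 1))
```

An alignment containing gaps would need a convention for how gap columns are counted, which this sketch deliberately leaves out.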



The complete length ORF81a nucleotide sequence [<SEQ ID 307>] (SEQ ID NO: 307) is:



1 ATGAAAAAAT CCCTTTTCGT TCTCTTTCTG TATTCGTCCC TACTTACTGC 

51 CAGCGAAATT GCTTATCGCT TTGTATTCGG AATTGAAACC TTACCGGCTG 

101 CAAAAATGGC AGAAACGTTT GCGCTGACAT TTGTGATTGC TGCGCTGTAT 

151 CTGTTTGCGC GTTATAAGGC AACGCGTTTG TTGATTGCGG TGTTTTTCGC 

201 GTTCAGCATT ATTGCCAACA ATGTGCATTA CGCGGTTTAT CAAAGCTGGA 

251 TAACGGGCAT TAATTATTGG CTGATGCTGA AAGAGATTAC CGAAGTTGGC 

301 GGCGCAGGGG CGTCGATGTT GGATAAGTTG TGGCTGCCTG CGTTGTGGGG 

351 CGTGTTGGAA GTCATGTTGT TTTGCAGCCT TGCCAAGTTC CGCCGTAAGA 

401 CGCATTTTTC TGCCGATATA CTGTTTGCCT TCCTAATGCT GATGATTTTC 

451 GTGCGTTCGT TCGACACGAA ACAAGAACAC GGTATTTCGC CCAAACCGAC

501 ATACAGCCGC ATCAAAGCCA ATTATTTCAG CTTCGGTTAT TTTGTCGGAC 

551 GCGTGTTGCC GTATCAGTTG TTTGATTTAA GCAAGATTCC TGTGTTCAAA 

601 CAGCCTGCTC CAAGCAGAAT CGGGCAAGGC AGTATTCAAA ATATCGTCCT 

651 GATTATGGGC GAAAGCGAAA GCGCGGCGCA TTTGAAATTG TTTGGCTACG 

701 GGCGCGAAAC TTCGCCGTTT TTGACCCAGC TTTCGCAAGC CGATTTTAAG 

751 CCGATTGTGA AACAAAGTTA TTCCGCAGGC TTTATGACGG CAGTATCCCT 

801 GCCCAGTTTC TTTAACGTCA TACCGCATGC CAACGGCTTG GAACAAATCA 

851 GCGGCGGCGA TATTGTGGAT AAGTACGACA ACACCATCCA CAAAACCGAC 

901 CAAATGATTC AAACCGTATT CGAGCAGCTG CAAAAGCAGC CTGACGGCAA 

951 CTGGCTGTTT GCCTATACCT CCGATCATGG CCAGTATGTT CGCCAAGATA 

1001 TCTACAATCA AGGCACGGTG CAGCCCGACA GCTATCTCGT GCCGCTGGTG 

1051 TTGTACAGCC CGGATAAGGC CGTGCAACAG GCTGCCAACC AGGCTTTTGC 

1101 GCCTTGCGAG ATTGCCTTCC ATCAGCAGCT TTCAACGTTC CTGATTCACA 

1151 CGTTGGGCTA CGATATGCCG GTTTCAGGTT GTCGCGAAGG CTCGGTAACG 

1201 GGCAACCTGA TTACGGGTGA TGCAGGCAGC TTGAACATTC GCGACGGCAA

1251 GGCGGAATAT GTTTATCCGC AATGA 
This encodes a protein having amino acid sequence [<SEQ ID 308>] (SEQ ID NO: 308):

1 MKKSLFVLFL YSSLLTASEI AYRFVFGIET LPAAKMAETF ALTFVIAALY 

51 LFARYKATRL LIAVFFAFSI IANNVHYAVY QSWITGINYW LMLKEITEVG

101 GAGASMLDKL WLPALWGVLE VMLFCSLAKF RRKTHFSADI LFAFLMLMIF

151 VRSFDTKQEH GISPKPTYSR IKANYFSFGY FVGRVLPYQL FDLSKIPVFK

201 QPAPSRIGQG SIQNIVLIMG ESESAAHLKL FGYGRETSPF LTQLSQADFK

251 PIVKQSYSAG FMTAVSLPSF FNVIPHANGL EQISGGDIVD KYDNTIHKTD

301 QMIQTVFEQL QKQPDGNWLF AYTSDHGQYV RQDIYNQGTV QPDSYLVPLV

351 LYSPDKAVQQ AANQAFAPCE IAFHQQLSTF LIHTLGYDMP VSGCREGSVT

401 GNLITGDAGS LNIRDGKAEY VYPQ*

ORF81a (SEQ ID NO: 308) and ORF81-1 (SEQ ID NO: 306) show 77.9% identity in 524 aa
overlap:

10 20 30 40 50 60 

orf81a.pep MKKSLFVLFLYSSLLTASEIAYRFVFGIETLPAAKMAETFALTFVIAALYLFARYKATRL

MM::: I | || I I I I I I I I I I I I I I I I I I I I I I I : I I I I I I I I I I I I I I I I I I I I : I I I 
orf 81-1 MKKS FLTLVLYSSLLTASEIAYRFVFGIETLPAAKIAETFALTFVIAALYLFARYKVTRL 

10 20 30 40 50 60 

70 80 90 100 110 120

orf81a.pep LIAVFFAFSIIANNVHYAVYQSWITGINYWLMLKEITEVGGAGASMLDKLWLPALWGVLE

MINI MlllllllllllilHIIIIIII IMIIMMI IIIIMMMIM 

orf 81-1 LIAVFFAFS I IANNVHYAVYQSWMTGINYWLMLKEVTEVGSAGASMLDKLWLPVLWGVLE 

70 80 90 100 110 120 

130 140 150 160 170 180 

orf81a.pep VMLFCSLAKFRRKTHFSADILFAFLMLMIFVRSFDTKQEHGISPKPTYSRIKANYFSFGY

IIIIIIIIIIIIIIIIIMIIIIIIIIIIIII IMIIIMM llllllllllll I 

or f 8 1 - 1 VMLFCSLAKFRRKTHFSAD I LFAFLMLM I FVRS FDTKQEHGI S PKPTYSRIKANYFS FGY 

130 140 150 160 170 180 

190 200 210 220 230 240 

orf81a.pep FVGRVLPYQLFDLSKIPVFKQPAPSRIGQGSIQNIVLIMGESESAAHLKLFGYGRETSPF

I I II I I I I I I I I I I : I | = I I I I I II : I I I I I = I I I I I II I I I I I I I I I I I I I I I I I I I I I 
orf 81-1 FVGRVLPYQLFDLSRIPAFKQPAPSKIGQGSVQNIVLIMGESESAAHLKLFGYGRETSPF 

190 200 210 220 230 240 

250 260 270 280 

orf81a.pep LTQLSQADFKPIVKQSYSAGFMTAVSLPSFFNVIPHANGLEQISGGD

|:| II I I I I I I I I I I M I I I I I I I I I I I I I: I I I I I I I I I I I I 
orf 81-1 LTRLSQADFKPIVKQSYSAGFMTAVSLPSFFNAI PHANGLEQ I SGGDTNMFRLAKEQGYE 

250 260 270 280 290 300 



orf81a.pep



orf 81-1 TYFYSAQAENEMAILNLIGKKWIDHLIQPTQLGYGNGDNMPDEKLLPLFDKINLQQGKHF 

310 320 330 340 350 360 

290 300 310 320 

orf81a.pep IVDKYDNTIHKTDQMIQTVFEQLQKQPDGNWLF

I M I I I I I I I I I I I I I I I I I I I I I I I I I I I I 
orf 81-1 IVLHQRGSHAPYGALLQPQDKVFGEADIVDKYDNTIHKTDQMIQTVFEQLQKQPDGNWLF 

370 380 390 400 410 420 
330 340 350 360 370 380

orf 81a . pep AYTSDHGQYVRQD I YNQGTVQPDS YLVPLVLYSPDKAVQQAANQAFAPCE I AFHQQLSTF 

IMMI I 1 1 1 1 1 1 1 1 1 1 1 1 IM I II Mil II Mil! I M II IN I Ml 1 1 II 1 1 1 

orf 81-1 AYTSDHGQYVRQD I YNQGTVQPDS YLVPLVLYSPDKAVQQAANQAFAPCE I AFHQQLSTF 

430 440 450 460 470 480 

390 400 410 420 

orf 81a . pep LIHTLGYDMPVSGCREGSVTGNLITGDAGSLNIRDGKAEYVYPQX 

I MIIIIIIIIMMIIIIIIIII IIIIMIIIIIMMM 

orf 81-1 L I HTLGYDMPVSGCREGS VTGNL I TGDAGS LN I RDGKAE YVYPQX 

490 500 510 520 

Homology with a predicted ORF from N. gonorrhoeae 

The N- and C-terminal aa sequences of ORF81 (SEQ ID NO: 304) align with a predicted ORF
(ORF81.ng) (SEQ ID NO: 310) from N. gonorrhoeae with 82.4% and 97.5% identity over 85 aa and
121 aa overlaps, respectively:



orf 81. pep MKKSFLTLVLYSSLLTASEIAYPLELGIETLPAAKIAETFALTFVIAALYLFARNKVTRL 60 

||||:::| I I I I I I I I I I I I I : : I I I I I I I I I = I I I I I I I 1 = I I I I I I I I I 1 = = I I 
orf 81ng MKKSLFVLFLYSSLLTASEIAYRFVFGIETLPAAKMAETFALTFMIAALYLFARYKASRL 60 

orf 81. pep LIAVFFAFS I I ANNVHYADYQSWMT 85 

IIIIMIMI Mill MINI 
orf81ng LIAVFFAFSMIANNVHYAVYQSWMTGINYWLMLKEVTEVGSAGASMLDKLWLPALWGVAE 120

// 

orf 81 pep QTVFEQLQKTPDGNWLFAYTSDHGQYVRQD 433 

MINIMI I I I I I I I I I ! I I I I I I I I I 
or f 8 lng ALLQPQDKVFGEAD I VDKYDNT IHKTDQMIQTVFEQLQKQPDGNWLFAYTSDHGQYVRQD 433 

orf 81 . pep I YNQGTVQPDS YLVPLVLYSPDKAVQQAANQAFAPCE I AFHQQLSTFL I HTLGYDMPVSG 4 93 

I i I I I I I I I I I I : I I I I I I I ! I I M I I I I I I I I I ! I I I I I I I I I I I I I I I I I I I I I I I I I 

orf 8 lng IYNQGTVQPDSY I VPLVLYSPDKAVQQAANQAFAPCE I AFHQQLSTFL I HTLGYDMPVSG 4 93 

orf 81 . pep CREGS VTGNL I TGDAGS LN I RDGKAE YVYPQ 524 

II MM MM Ml lllllllhll II lllll- 
or f 8 lng CREGS VTGNL I TGDAGS LN I RNGKAEYVYPQ 524 

The complete length ORF81ng nucleotide sequence [<SEQ ID 309>] (SEQ ID NO: 309) is:



1 ATGAAAAAAT CCCTTTTCGT TCTCTTTCTG TATTCATCCC TACTTACCGC 

51 CAGCGAAATC GCCTATCGCT TTGTATTCGG AATTGAAACC TTACCGGCTG 

101 CAAAAATGGC GGAAACGTTT GCGCTGACAT TTATGATTGC TGCGCTGTAT 

151 CTGTTTGCGC GTTATAAGGC TTCGCGGCTG CTGATTGCGG TGTTTTTCGC 

201 GTTCAGCATG ATTGCCAACA ATGTGCATTA CGCGGTTTAT CAAAGCTGGA
251 TGACGGGTAT TAACTATTGG CTGATGCTGA AAGAGGTTAC CGAAGTCGGC
301 AGCGCGGGCG CGTCGATGTT GGATAAGTTG TGGCTGCCTG CTTTGTGGGG

351 CGTGGCGGAA GTCATGTTGT TTTGCAGCCT TGCCAAGTTC CGCCGTAAGA

401 CGCATTTTTC TGCCGATATA CTGTTTGCCT TCCTAATGCT GATGATTTTC
451 GTGCGTTCGT TCGACACGAA ACAAGAGCAC GGTATTTCGC CCAAACCGAC
501 ATACAGCCGC ATCAAAGCCA ATTATTTCAG CTTCGGTTAT TTTGTCGGGC 
551 GCGTGTTGCC GTATCAGTTG TTTGATTTAA GCAAGATCCC TGTGTTCAAA 
601 CAGCCTGCTC CAAGCAAAAT CGGGCAAGGC AGTATTCAAA ATATCGTCCT 
651 GATTATGGGC GAAAGCGAAA GCGCGGCGCA TTTGAAATTG TTTGGTTACG 
701 GGCGCGAAAC TTCGCCGTTT TTAACCCGGC TGTCGCAAGC CGATTTTAAG 
751 CCGATTGTGA AACAAAGTTA TTCCGCAGGC TTTATGACGG CAGTATCCCT

801 GCCCAGTTTC TTTAACGTCA TACCGCACGC CAACGGCTTG GAACAAATCA 

851 GCGGCGGCGA TACCAATATG TTCCGCCTCG CCAAAGAGCA GGGCTATGAA 

901 ACGTATTTTT ACAGTGCCCA GGCTGAAAAC CAAATGGCAA TTTTGAACTT 

951 AATCGGTAAG AAATGGATAG ACCATCTGAT TCAGCCGACG CAACTTGGCT

1001 ACGGCAACGG CGACAATATG CCCGATGAGA AGCTGCTGCC GTTGTTCGAC 

1051 AAAATCAATT TGCAGCAGGG CAGGCATTTT ATCGTGTTGC ACCAACGCGG 

1101 TTCGCACGCC CCATACGGCG CATTGTTGCA GCCTCAAGAT AAAGTATTCG 

1151 GCGAAGCCGA TATTGTGGAT AAGTACGACA ACACCATCCA CAAAACCGAC 

1201 CAAATGATTC AAACCGTATT CGAGCAGCTG CAAAAGCAGC CTGACGGCAA

1251 CTGGCTGTTT GCCTATACCT CCGATCATGG CCAGTATGTG CGCCAAGATA

1301 TCTACAATCA AGGCACGGTG CAGCCCGACA GCTATATTGT GCCTCTGGTT

1351 TTGTACAGCC CGGATAAGGC CGTGCAACAG GCTGCCAACC AGGCTTTTGC

1401 GCCTTGCGAG ATTGCCTTCC ATCAGCAGCT TTCAACGTTC CTGATTCACA

1451 CGTTGGGCTA CGATATGCCG GTTTCAGGTT GTCGCGAAGG CTCGGTAACA

1501 GGCAACCTGA TTACGGGCGA TGCAGGCAGC TTGAACATTC GCAACGGCAA 

1551 GGCGGAATAT GTTTATCCGC AATAA 

This encodes a protein having amino acid sequence [<SEQ ID 310>] (SEQ ID NO: 310):

1 MKKSLFVLFL YSSLLTASEI AYRFVFGIET LPAAKMAETF ALTFMIAALY

51 LFARYKASRL LIAVFFAFSM IANNVHYAVY QSWMTGINYW LMLKEVTEVG

101 SAGASMLDKL WLPALWGVAE VMLFCSLAKF RRKTHFSADI LFAFLMLMIF

151 VRSFDTKQEH GISPKPTYSR IKANYFSFGY FVGRVLPYQL FDLSKIPVFK

201 QPAPSKIGQG SIQNIVLIMG ESESAAHLKL FGYGRETSPF LTRLSQADFK
251 PIVKQSYSAG FMTAVSLPSF FNVIPHANGL EQISGGDTNM FRLAKEQGYE

301 TYFYSAQAEN QMAILNLIGK KWIDHLIQPT QLGYGNGDNM PDEKLLPLFD

351 KINLQQGRHF IVLHQRGSHA PYGALLQPQD KVFGEADIVD KYDNTIHKTD

401 QMIQTVFEQL QKQPDGNWLF AYTSDHGQYV RQDIYNQGTV QPDSYIVPLV
451 LYSPDKAVQQ AANQAFAPCE IAFHQQLSTF LIHTLGYDMP VSGCREGSVT

501 GNLITGDAGS LNIRNGKAEY VYPQ*

ORF81ng (SEQ ID NO: 310) and ORF81-1 (SEQ ID NO: 306) show 96.4% identity in 524 aa
overlap:

10 20 30 40 50 60 

orf81ng-1.pep MKKSLFVLFLYSSLLTASEIAYRFVFGIETLPAAKMAETFALTFMIAALYLFARYKASRL

||||:::| I I I I I I I I I I II I I I I I I I I I I I I I h I I I II I I I : I I I I M I I I I h : I I 
orf 81-1 MKKSFLTLVLYSSLLTASEIAYRFVFGIETLPAAKIAETFALTFVIAALYLFARYKVTRL 

10 20 30 40 50 60 

70 80 90 100 110 120 

orf81ng-1.pep LIAVFFAFSMIANNVHYAVYQSWMTGINYWLMLKEVTEVGSAGASMLDKLWLPALWGVAE

1 1 1 1 1 1 1 M i 1 1 1 1 1 1 1 1 1 1 M I ! M 1 1 1 1 1 1 M 1 1 1 M I M 1 1 1 1 1 1 1 M 1 1 I 

orf 81-1 LIAVFFAFS 1 1 ANNVHYAVYQSWMTGINYWLMLKEVTEVGSAGASMLDKLWLPVLWGVLE 

70 80 90 100 110 120 

130 140 150 160 170 180 

orf81ng-1.pep VMLFCSLAKFRRKTHFSADILFAFLMLMIFVRSFDTKQEHGISPKPTYSRIKANYFSFGY

! 1 1 M 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 M 1 1 1 1 1 II 1 1 1 1 1 1 1 1 1 1 1 1 h 1 1 1 1 M 

orf 81-1 VMLFCSLAKFRRKTHFSADILFAFLMLMIFVRS FDTKQEHGISPKPTYSR IKANYFSFGY 

130 140 150 160 170 180 

190 200 210 220 230 240 

orf81ng-1.pep FVGRVLPYQLFDLSKIPVFKQPAPSKIGQGSIQNIVLIMGESESAAHLKLFGYGRETSPF

MUM II I II MM Mill! MINIMI MINIMUM MINI Mill 
orf 81 - 1 FVGRVLPYQLFDLSRIPAFKQPAPSKIGQGSVQNIVLIMGESESAAHLKLFGYGRETSPF 
190 200 210 220 230 240

250 260 270 280 290 300 

orf 81ng- 1 . pep LTRLSQADFKPIVKQSYSAGFMTAVSLPSFFNVIPHANGLEQISGGDTNMFRLAKEQGYE 

1 II I M 1 1 1 ' II 1 1 II I II 1 1 1 1 1 Ml II h 1 1 1 1 1 1 II 1 1 1 II 1 1 1 1 1 Ml 1 1 1 1 M 

orf81-1 LTRLSQADFKPIVKQSYSAGFMTAVSLPSFFNAIPHANGLEQISGGDTNMFRLAKEQGYE

250 260 270 280 290 300 

310 320 330 340 350 360 

orf 81ng-l .pep TYFYSAQAENQMAILNLIGKKWIDHLIQPTQLGYGNGDNMPDEKLLPLFDKINLQQGRHF 
II I I I I II II M II I I I II M I M I I I M I I M II II I I M I M M I I I I I II II I M 
orf81-1 TYFYSAQAENEMAILNLIGKKWIDHLIQPTQLGYGNGDNMPDEKLLPLFDKINLQQGKHF

310 320 330 340 350 360 

370 380 390 400 410 420 

orf 81ng- 1 . pep I VLHQRGSHAP YGALLQPQDKVFGEAD I VDKYDNT I HKTDQM I QTVFEQLQKQPDGNWLF 
I I M I M I I I I I I II I I I I M II I I I I I I I I I I I I I I I I I I I I I I II M I I I II I M 
orf81-1 IVLHQRGSHAPYGALLQPQDKVFGEADIVDKYDNTIHKTDQMIQTVFEQLQKQPDGNWLF

370 380 390 400 410 420 

430 440 450 460 470 480 

orf 81ng-l .pep AYTSDHGQYVRQDIYNQGTVQPDSYIVPLVLYSPDKAVQQAANQAFAPCEIAFHQQLSTF 

MIMMMMMMIIMM MMMMMMMIMM MIMIIIH IIMM 

orf81-1 AYTSDHGQYVRQDIYNQGTVQPDSYLVPLVLYSPDKAVQQAANQAFAPCEIAFHQQLSTF

430 440 450 460 470 480 

490 500 510 520

orf 81ng- 1 . pep L I HTLGYDMP VSGCREGS VTGNL I TGDAGS LNI RNGKAE YVYPQX 

MMIIIMI IMIIIIIII MMMMMMMMMIMM 

orf81-1 LIHTLGYDMPVSGCREGSVTGNLITGDAGSLNIRDGKAEYVYPQX

490 500 510 520 

Furthermore, ORF81ng (SEQ ID NO: 310) shows significant homology to an E. coli OMP (SEQ ID
NO: 1133):

gi|1256380 (U50906) outer membrane adherence protein-associated protein [E. coli]

Length = 547
Score = 87.4 bits (213), Expect = 2e-16 

Identities = 122/468 (26%), Positives = 198/468 (42%), Gaps = 70/468 (14%) 

Query: 25 VFGIETLPAAKMAETFA- LTFMIAALYLFARYKAS - - RLLI AVFFAFSMIANNVHYAVYQ 81 
VFGI LA+A LF+++R + RLL+A F + A ++ ++Y

Sbjct: 29 VFGITNLVASSGAHMVQRLLFFVLTI LWKRI SSLPLRLLVAAPFVL - LTAADMS I SLY - 86 

Query: 82 SWMT GINYWLMLKEVTEVGSAGASMLDKLWLPALWGVAEVMLFCSLAKFRRKT 134 

SW T G ++ + EV A ML ++ P L A + L + 
Sbjct : 87 SWCTFGTTFNDGFAISVLQSDPDEV AKMLG-MYSPYLCAFAFLSLLFLAVI IKYDV 141 

Query: 135 HFS AD I LFAFLMLM I FVRS F DTKQEHGISPKPTYSRIKAN- - YFSFGYFVG 183

+ L+L+ + S D K ++ SP SR +F+ YF 

Sbjct: 142 SLPTKKVTGILLLIVISGSLFSACQFAYKDAKNKNAFSPYILASRFATYTPFFNLNYFAL 201 

Query: 184 RVLPYQ- -LFDLSKIPVFKQPAPSKIGQGSIQNIVLIMGESESAAHLKLFGYGRETSPFL 241 
+Q L + +P F+ + I VLI+GES ++ L+GY R T+P + 

Sbjct: 202 AAKEHQRLLS I ANTVPYFQL SVRDTGIDTYVLIVGESVRVDNMSLYGYTRSTTPQV 257



Query: 242 TRLSQADFKPIVKQSYSAGFMTAVSLP S FFNVI PHANGLEQ I SGGDTNMFRLAKEQG 298

+Q + Q+ S TA+S+P + +V+ H I N+ +A + G

Sbjct: 258 E - - AQRKQ I KLFNQAI SGAP YTALS VPLS LTADS VLSH D I HN Y PDN 1 1 NMANQ AG 310 

Query: 299 YETYFYSAQA ENQMAILNLIGKKWIDHLIQPTQLGYGNGDNMPDEKLLPLFDKINLQ 355 

++T++ S+Q+ +N A+ ++ ++ + Y G DE LLP + Q 
Sbjct: 311 FQTFWLSSQSAFRQNGTAVTS I AMRAMETVYVRGF DELLLPHLSQALQQ 359 

Query: 356 - -QGRHFIVLHQRGSHAPYGALLQPQDKVFGEADIVDK-YDNTIHKTDQMIQTVFEQLQK 412 

Q + IVLH GSH P + VF D D YDN+IH TD ++ VFE L+ 

Sbjct: 360 NTQQKKLIVLHLNGSHEPACSAYPQSSAVFQPQDDQDACYDNSIHYTDSLLGQVFELLK- 418 

Query: 413 QPDGNWLFAYTSDHG QYVRQDIYNQG- -TVQPDSYIVPL- VLYSP 454 

D Y +DHG ++++Y G +Y VP+ + YSP 

Sbjct: 419 --DRRASVMYFADHGLERDPTKKNVYFHGGREASQQAYHVPMFIWYSP 464 

Based on this analysis, including the presence of a putative leader sequence (double-underlined) 
and several putative transmembrane domains (single-underlined) in the gonococcal protein, it is 
predicted that the proteins from N. meningitidis and N. gonorrhoeae, and their epitopes, could be 
useful antigens for vaccines or diagnostics, or for raising antibodies. 

Example 37 

The following partial DNA sequence was identified in N. meningitidis [<SEQ ID 311>] (SEQ ID
NO: 311):

1 . . .ACCCTGCTCC TGTTCATCCC CCTCGTCCTC ACAC . GTGCG GCACACTGAC 

51 CGGCATACTC GCCCaCGGCG GCGGCAAACG CTTTGCCGTC GAACAAGAAC 

101 TCGTCGCCGC ATCGTCCCGC GCCGCCGTCA AAGAAATGGA TTTGTCCGCC 

151 yTAAAAGGAC GCAAAGCCGC CyTTTACGTC TCCGTTATGG GCGACCAAGG 

201 TTCGGGCAAC ATAAGCGGCG GACGCTACTC TATCGACGCA CTGATACGCG 

251 GCGGCTACCA CAACAACCCC GAAAGTGCCA GCCAATACAG CTACCCCGCC 

301 TACGACACTA CCGCCACCAC CAAATCCGAC GCGCTCTCCA GCGTAACCAC 

351 TTCCACATCG CTTTTGAACG CCCCCGCCGC CGyCyTGACG AAAAACAGCG 

401 GACGCAAAGG CGAACGcTCC GCCGGACTGT CCGTCAACGG CACGGGCGAC 

451 TACCGCAACG AAACCCTGCT CGCCAACCCC CGCGACGTTT CCTTCCTGAC

501 CAACCTCATC CAAACCGTCT TCTACCTGCG CGGCATCGAA GTCgTACCGC 

551 CCGrATACGC CGACACCGAC GTATTCGTAA CCGTCGACGT A... 

This corresponds to the amino acid sequence [<SEQ ID 312; ORF83>] (SEQ ID NO: 312;
ORF83):



1 . . TLLLFIPLVL TXCGTLTGIL AHGGGKRFAV EQELVAASSR AAVKEMDLSA 

51 LKGRKAAXYV SVMGDQGSGN ISGGRYSIDA LIRGGYHNNP ESATQYSYPA 

101 YDTTATTKSD ALSSVTTSTS LLNAPAAXLT KNSGRKGERS AGLSVNGTGD 

151 YRNETLLANP RDVSFLTNLI QTVFYLRGIE VVPPXYADTD VFVTVDV..

Further work revealed the complete nucleotide sequence [<SEQ ID 313>] (SEQ ID NO: 313):



1 ATGAAAACCC TGCTCCTCCT CATCCCCCTC GTCCTCACAG CCTGCGGCAC 
51 ACTGACCGGC ATACCCGCCC ACGGCGGCGG CAAACGCTTT GCCGTCGAAC

101 AAGAACTCGT CGCCGCATCG TCCCGCGCCG CCGTCAAAGA AATGGATTTG 

151 TCCGCCCTAA AAGGACGCAA AGCCGCCCTT TACGTCTCCG TTATGGGCGA 

201 CCAAGGTTCG GGCAACATAA GCGGCGGACG CTACTCTATC GACGCACTGA 

251 TACGCGGCGG CTACCACAAC AACCCCGAAA GTGCCACCCA ATACAGCTAC 

301 CCCGCCTACG ACACTACCGC CACCACCAAA TCCGACGCGC TCTCCAGCGT 

351 AACCACTTCC ACATCGCTTT TGAACGCCCC CGCCGCCGCC CTGACGAAAA 

401 ACAGCGGACG CAAAGGCGAA CGCTCCGCCG GACTGTCCGT CAACGGCACG

451 GGCGACTACG GCAACGAAAC CCTGCTCGCC AACCCCCGCG ACGTTTCCTT

501 CCTGACCAAC CTCATCCAAA CCGTCTTCTA CCTGCGCGGC ATCGAAGTCG 

551 TACCGCCCGA ATACGCCGAC ACCGACGTAT TCGTAACCGT CGACGTATTC 

601 GGCACCGTCC GCAGCCGTAC CGAACTGCAC CTCTACAACG CCGAAACCCT 

651 TAAAGCCCAA ACCAAGCTCG AATATTTCGC CGTTGACCGC GACAGCCGGA 

701 AACTGCTGAT TACCCCTAAA ACCGCCGCCT ACGAATCCCA ATACCAAGAA 

751 CAATACGCCC TTTGGACCGG CCCTTACAAA GTCAGCAAAA CCGTCAAAGC 

801 CTCAGACCGC CTGATGGTCG ATTTCTCCGA CATTACCCCC TACGGCGACA

851 CAACCGCCCA AAACCGTCCC GACTTCAAAC AAAACAACGG TAAAAAACCC 

901 GATGTCGGCA ACGAAGTCAT CCGCCGCCGC AAAGGAGGAT AA 



This corresponds to the amino acid sequence [<SEQ ID 314; ORF83-1>] (SEQ ID NO: 314;
ORF83-1):



1 MKTLLLLIPL VLTACGTLTG IPAHGGGKRF AVEQELVAAS SRAAVKEMDL

51 SALKGRKAAL YVSVMGDQGS GNISGGRYSI DALIRGGYHN NPESATQYSY 

101 PAYDTTATTK SDALSSVTTS TSLLNAPAAA LTKNSGRKGE RSAGLSVNGT 

151 GDYRNETLLA NPRDVSFLTN LIQTVFYLRG IEVVPPEYAD TDVFVTVDVF

201 GTVRSRTELH LYNAETLKAQ TKLEYFAVDR DSRKLLITPK TAAYESQYQE 

251 QYALWTGPYK VSKTVKASDR LMVDFSDITP YGDTTAQNRP DFKQNNGKKP 

301 DVGNEVIRRR KGG*
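
A quick consistency check between a complete nucleotide sequence and its translation: an open reading frame that encodes n residues followed by a single stop codon spans 3(n + 1) bases. The following is a minimal Python sketch of that arithmetic; the function name is illustrative.

```python
def expected_orf_length(protein_length: int) -> int:
    """Number of bases needed to encode protein_length residues plus one stop codon."""
    if protein_length < 0:
        raise ValueError("protein length cannot be negative")
    return 3 * (protein_length + 1)


# ORF83-1: 313 residues; the nucleotide listing above ends at base 942.
print(expected_orf_length(313))
# ORF81-1: 524 residues; its nucleotide listing ends at base 1575.
print(expected_orf_length(524))
```

A mismatch between this value and the last position of a listing would point to a dropped or duplicated group in the scanned text rather than to a biological difference.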

Computer analysis of this amino acid sequence gave the following results: 
Homology with a predicted ORF from N. meningitidis (strain A)

ORF83 (SEQ ID NO: 312) shows 96.4% identity over a 197aa overlap with an ORF (ORF83a)
(SEQ ID NO: 316) from strain A of N. meningitidis:

10 20 30 40 50 

orf 83 . pep TLLLFIPLVLTX CGTLTGILAHGGGKRFAVEQELVAASSRAAVKEMDLSALKGRKAAX 

III :|IMII lllllll I I I I I I I I I I M i M I I II I II I I I I I I I I I I I I I 
orf 83 a MKTLLXLI PLVLTA CGTLTGI PAHGGGKRFAVEQELVAASSRAAVKEMDLSALKGRKAAL 

10 20 30 40 50 60 



60 70 80 90 100 110 

orf 83 .pep YVSVMGDQGSGNISGGRYSIDALIRGGYHNNPESATQYSYPAYDTTATTKSDALSSVTTS 

llllll IIMI I I I MIMIIIIIIIIIIIIMMMIIIIIMIIIIIIIM! 

orf 83a YVSVMGDQGSGNISGGRYSIDALIRGGYHNNPESATQYSYPAYDTTATTKSDALSSVTTS 

70 80 90 100 110 120 



120 130 140 150 160 170 

orf 83 . pep TSLLNAPAAXLTKNSGRKGERSAGLSVNGTGDYRNETLLANPRDVS FLTNLIQTVFYLRG 

Illllllll I I I M I I I I I I I I I I I I I I I i I I I I I I I I I I I I I I I I I I I i I I I I I I I 
orf 83a TSLLNAPAAALTKNSGRKGERSAGLSVNGTGDYRNETLLANPRDVS FLTNLIQTVFYLRG 

130 140 150 160 170 180 
180 190
orf83.pep IEVVPPXYADTDVFVTVDV

Mill IIIIIIIIIIII 

orf83a IEVVPPEYADTDVFVTVDVFGTVRSRTELHLYNAETLKAQTKLEYFAVDRDSRKLLIAPK
190 200 210 220 230 240

The complete length ORF83a nucleotide sequence [<SEQ ID 315>] (SEQ ID NO: 315) is:

1 ATGAAAACCC TGCTCNTCCT CATCCCCCTC GTCCTCACAG CCTGCGGCAC 

51 ACTGACCGGC ATACCCGCCC ACGGCGGCGG CAAACGCTTT GCCGTCGAAC 

101 AAGAACTCGT CGCCGCATCG TCCCGCGCCG CCGTCAAAGA AATGGACTTG

151 TCCGCCCTGA AAGGACGCAA AGCCGCCCTT TACGTCTCCG TTATGGGCGA 

201 CCAAGGTTCG GGCAACATAA GCGGCGGACG CTACTCTATC GACGCACTGA 

251 TACGCGGCGG CTACCACAAC AACCCCGAAA GTGCCACCCA ATACAGCTAC 

301 CCCGCCTACG ACACTACCGC CACCACCAAA TCCGACGCGC TCTCCAGCGT 

351 AACCACTTCC ACATCGCTTT TGAACGCCCC CGCCGCCGCC CTGACGAAAA

401 ACAGCGGACG CAAAGGCGAA CGCTCCGCCG GACTGTCCGT CAACGGCACG

451 GGCGACTACC GCAACGAAAC CCTGCTCGCC AACCCCCGCG ACGTTTCCTT 

501 CCTGACCAAC CTCATCCAAA CCGTCTTCTA CCTGCGCGGC ATCGAAGTCG 

551 TACCGCCCGA ATACGCCGAC ACCGACGTAT TCGTAACCGT CGACGTATTC 

601 GGCACCGTCC GCAGCCGCAC CGAACTGCAC CTCTACAACG CCGAAACCCT

651 TAAAGCCCAA ACCAAGCTCG AATATTTCGC CGTTGACCGC GACAGCCGGA 

701 AACTGCTGAT TGCCCCTAAA ACCGCCGCCT ACGAATCCCA ATACCAAGAA 

751 CAATACGCCC TCTGGATGGG ACCTTACAGC GTCGGCAAAA CCGTCAAAGC 

801 CTCAGACCGC CTGATGGTCG ATTTCTCCGA CATCACCCCC TACGGCGACA 

851 CAACCGCCCA AAACCGTCCC GACTTCAAAC AAAACAACGG TAAAAAACCC

901 GATGTCGGCA ACGAAGTCAT CCGCCGCCGC AAAGGAGGAT AA 

This encodes a protein having amino acid sequence [<SEQ ID 316>] (SEQ ID NO: 316):

1 MKTLLXLIPL VLTACGTLTG IPAHGGGKRF AVEQELVAAS SRAAVKEMDL

51 SALKGRKAAL YVSVMGDQGS GNISGGRYSI DALIRGGYHN NPESATQYSY

101 PAYDTTATTK SDALSSVTTS TSLLNAPAAA LTKNSGRKGE RSAGLSVNGT

151 GDYRNETLLA NPRDVSFLTN LIQTVFYLRG IEVVPPEYAD TDVFVTVDVF

201 GTVRSRTELH LYNAETLKAQ TKLEYFAVDR DSRKLLIAPK TAAYESQYQE

251 QYALWMGPYS VGKTVKASDR LMVDFSDITP YGDTTAQNRP DFKQNNGKKP

301 DVGNEVIRRR KGG*

ORF83a (SEQ ID NO: 316) and ORF83-1 (SEQ ID NO: 314) show 98.4% identity in 313 aa
overlap:

10 20 30 40 50 60 

orf83a.pep MKTLLXLIPLVLTACGTLTGIPAHGGGKRFAVEQELVAASSRAAVKEMDLSALKGRKAAL

III! I I I II I I I I ! I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 1 1 1 M 1 1 1 1 
orf 83 - 1 MKTLLLL I PLVLTACGTLTGI PAHGGGKRFAVEQELVAAS SRAAVKEMDLSALKGRKAAL 

10 20 30 40 50 60 

70 80 90 100 110 120 

orf83a.pep YVSVMGDQGSGNISGGRYSIDALIRGGYHNNPESATQYSYPAYDTTATTKSDALSSVTTS

1 1 1 1 II 1 1 1 1 1 1 1 II 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 M 1 1 1 1 1 1 1 1 1 1 1 II I I II 1 1 1 1 1 1 1 1 

orf 83 - 1 YVSVMGDQGSGNISGGRYSIDALIRGGYHNNPESATQYSYPAYDTTATTKSDALSSVTTS 

70 80 90 100 110 120

130 140 150 160 170 180

orf83a.pep TSLLNAPAAALTKNSGRKGERSAGLSVNGTGDYRNETLLANPRDVSFLTNLIQTVFYLRG

III Mllllll IIIIIIIMIIMIIIIIMIIIIII IIIMMIMIIIIIIIII 
orf83-1 TSLLNAPAAALTKNSGRKGERSAGLSVNGTGDYRNETLLANPRDVSFLTNLIQTVFYLRG

130 140 150 160 170 180 

190 200 210 220 230 240 

orf 83a . pep IEWPPEYADTDVFVTVDVFGTVRSRTELHLYNAETLKAQTKLEYFAVDRDSRKLLIAPK 

5 1 1 1 1 1 1 1 M 1 1 1 I 1 1 1 1 1 1 M 1 1 1 M 1 1 1 1 II I M' 1 1 Ml 1 1 1! 1 1 M 1 1 M I :| I 

orf 83 - 1 IEWPPEYADTDVFVTVDVFGTVRSRTELHLYNAETLKAQTKLEYFAVDRDSRKLLITPK 

190 200 210 220 230 240 

250 260 270 280 290 300 

orf 83a . pep TAAYESQYQEQYALWMGPYSVGKTVKASDRLMVDFSDITPYGDTTAQNRPDFKQNNGKKP 

10 MINIMUM 1 1 : 1 M 1 1 1 1 1 M 1 1 1 M I M 1 1 1 1 M 1 1 II I i M I II 1 1 II , 

orf 83 - 1 TAAYESQYQEQYALWTGPYKVSKTVKASDRLMVDFSDITPYGDTTAQNRPDFKQNNGKKP 
250 260 270 280 290 300 

310 

orf 83a . pep DVGNEV I RRRKGGX 

15 I I I I I I I I I I I I II 

orf 83 - 1 DVGNEV I RRRKGGX 

310 

Homology with a predicted ORF from N. gonorrhoeae 

ORF83 (SEQ ID NO: 312) shows 94.9% identity over a 197aa overlap with a predicted ORF
(ORF83.ng) (SEQ ID NO: 318) from N. gonorrhoeae:

orf83.pep TLLLFIPLVLTXCGTLTGILAHGGGKRFAVEQELVAASSRAAVKEMDLSALKGRKAAX 58

IIIMIMI Illllll IMIIIIIU IIIIIIIIIMIIMIIIM II 

or f 8 3 ng M KTLLLL I PLVLTACGTLTG I PAHGGGKRFAVEQELVAAS SRAAVKEMDLS ALKGRKAAL 6 0 

orf 83 . pep YVSVMGDQGSGNISGGRYSIDALIRGGYHNNPESATQYSYPAYDTTATTKSDALSSVTTS 118 

25 I I I I I I I I I I I I I I I I I I M I M I I I I I I I I I : I I I : I I I I I I I I I II I I I I I I h I I I I 

orf83ng YVSVMGDQGSGNISGGRYSIDALIRGGYHNNPDSATRYSYPAYDTTATTKSDALSGVTTS 120

or f 8 3 . pep TSLLNAPAAXLTKNSGRKGERSAGLSVNGTGDYRNETLLANPRDVSFLTNLIQTVFYLRG 178 

Illllll I I I: I I I I I I I I I II I I II I I I M I I I I I I I M I I I I I I II I I I I I I I 
or f 8 3 ng TSLLNAPAAALTKNNGRKGERS AGLS VNGTGDYRNETLLANPRDVS FLTNL IQTVFYLRG 180 

orf83.pep IEVVPPXYADTDVFVTVDV 197

I I I I I I IIIIIIIIIM 

orf83ng IEVVPPEYADTDVFVTVDVFGTVRSRTELHLYNAETLKAQTKLEYFAVDRDSRKLLIAPK 240

The complete length ORF83ng nucleotide sequence [<SEQ ID 317>] (SEQ ID NO: 317) is:

1 ATGAAAACCC TGCTCCTCCT CATCCCCCTC GTACTCACCG CCTGCGGCAC

51 ACTGACCGGC ATACCCGCCC ACGGCGGCGG CAAACGCTTT GCCGTCGAAC 

101 AGGAACTCGT CGCCGCATCG TCCCGCGCCG CCGTCAAAGA AATGGACTTG 

151 TCCGCCCTGA AAGGACGCAA AGCCGCCCTT TACGTCTCCG TTATGGGCGA 

201 CCAAGGTTCG GGCAACATAA GCGGCGGACG CTACTCCATC GACGCACTGA 

251 TACGCGGCGG CTACCACAAC AACCCCGACA GCGCCACCCG ATACAGCTAC

301 CCCGCCTATG ACACTACCGC CACCACCAAA TCCGACGCGC TCTCCGGCGT 

351 AACCACTTCC ACATCGCTTT TGAACGCCCC CGCCGCCGCC CTGACGAAAA 

401 ACAACGGACG CAAAGGCGAA CGCTCCGCCG GACTGTCCGT CAACGGCACG

451 GGCGACTACC GCAACGAAAC CCTGCTCGCC AACCCCCGCG ACGTTTCCTT

501 CCTGACCAAC CTCATCCAAA CCGTCTTCTA CCTGCGCGGC ATCGAAGTCG
551 TACCGCCCGA ATACGCCGAC ACCGACGTAT TCGTAACCGT CGACGTATTC
601 GGCACCGTCC GCAGCCGTAC CGAACTGCAC CTCTACAACG CCGAAACCCT 
651 TAAAGCCCAA ACCAAGCTCG AATATTTCGC CGTCGACCGC GACAGCCGGA 
701 AACTGCTGAT TGCCCCTAAA ACCGCCGCCT ACGAATCCCA ATACCAAGAA 
751 CAATACGCCC TCTGGATGGG ACCTTACAGC GTCGGCAAAA CCGTCAAAGC

801 CTCAGACCGC CTGATGGTCG ATTTCTCCGA CATCACCCCC TACGGCGACA 
851 CAACCGCCCA AAACCGTCCC GACTTCAAAC AAAACAACGG TAAAAACCCC 
901 GATGTCGGCA ACGAAGTCAT CCGCCGCCGC AAAGGAGGAT AA 

This encodes a protein having amino acid sequence [<SEQ ID 318>] (SEQ ID NO: 318):

1 MKTLLLLIPL VLTACGTLTG IPAHGGGKRF AVEQELVAAS SRAAVKEMDL
51 SALKGRKAAL YVSVMGDQGS GNISGGRYSI DALIRGGYHN NPDSATRYSY
101 PAYDTTATTK SDALSGVTTS TSLLNAPAAA LTKNNGRKGE RSAGLSVNGT
151 GDYRNETLLA NPRDVSFLTN LIQTVFYLRG IEVVPPEYAD TDVFVTVDVF
201 GTVRSRTELH LYNAETLKAQ TKLEYFAVDR DSRKLLIAPK TAAYESQYQE

251 QYALWMGPYS VGKTVKASDR LMVDFSDITP YGDTTAQNRP DFKQNNGKNP
301 DVGNEVIRRR KGG*

ORF83ng (SEQ ID NO: 318) and ORF83-1 (SEQ ID NO: 314) show 97.1% identity in 313 aa
overlap:

10 20 30 40 50 60 

orf 83 - 1 . pep MKTLLLLI PLVLTACGTLTGI PAHGGGKRFAVEQELVAASSRAAVKEMDLSALKGRKAAL 

1 1 1 M 1 1 1 1 1 1 !! 1 1 I M 1 1 1 1 1 1 1 i I M I 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 I 

orf83ng MKTLLLLIPLVLTACGTLTGIPAHGGGKRFAVEQELVAASSRAAVKEMDLSALKGRKAAL
10 20 30 40 50 60

70 80 90 100 110 120 

orf 83 - 1 . pep YVSVMGDQGSGNISGGRYSIDALIRGGYHNNPESATQYSYPAYDTTATTKSDALSSVTTS 

1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 m ii i i 1 1 1 1 1 1 1 1 h 1 1 hi 1 1 1 1 1 1 1 1 1 i i n ! I hi I I I 

orf 83ng YVSVMGDQGSGNISGGRYSIDALIRGGYHNNPDSATRYSYPAYDTTATTKSDALSGVTTS 
30 70 80 90 100 110 120 

130 140 150 160 170 180 

orf 83 - 1 . pep TSLLNAPAAALTKNSGRKGERSAGLSVNGTGDYRNETLLANPRDVSFLTNLIQTVFYLRG 
h i I h I i I I I h I I I I I I I I I I I I: I I I I I I I I I I I I I I II I I I I I I h I I I I I I I I 
orf83ng TSLLNAPAAALTKNNGRKGERSAGLSVNGTGDYRNETLLANPRDVSFLTNLIQTVFYLRG 
35 130 140 150 160 170 180 

190 200 210 220 230 240 

orf 83 - 1 . pep IEWPPEYADTDVFVTVDVFGTVRSRTELHLYNAETLKAQTKLEYFAVDRDSRKLLITPK 
i I I I I I I I I I I I I I I I I I I I I h I I I I I h I I I I I I I I I I M I I I I I I I II I hhh 
orf 83ng IEWPPEYADTDVFVTVDVFGTVRSRTELHLYNAETLKAQTKLEYFAVDRDSRKLLIAPK 
40 190 200 210 220 230 240 

250 260 270 280 290 300 

orf 83 - 1 . pep TAAYESQYQEQYALWTGPYKVSKTVKASDRLMVDFSDITPYGDTTAQNRPDFKQNNGKKP 

MM MINIUM MhMIIIIMIIIIIIMIIIIIMMMIIIMM MM 

orf 83ng TAAYESQYQEQYALWMGPYSVGKTVKASDRLMVDFSDITPYGDTTAQNRPDFKQNNGKNP 
45 250 260 270 280 290 300 

310 

orf 83 - 1 . pep DVGNE V I RRRKGGX 
lllllllllllll 
o r f 8 3 ng DVGNEV I RRR KGGX 

50 310 

Based on this analysis, including the presence of a putative ATP/GTP-binding site motif A 
(P-loop) in the gonococcal protein (double-underlined) and a putative prokaryotic membrane 
lipoprotein lipid attachment site (single-underlined), it is predicted that the proteins from 
N. meningitidis and N. gonorrhoeae, and their epitopes, could be useful antigens for vaccines or 
diagnostics, or for raising antibodies. 
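
The identity figures reported with the alignments in this Example can be reproduced from the aligned rows themselves. The sketch below is illustrative only: it assumes that percent identity is the number of identical aligned positions divided by the length of the overlap, which may differ in detail from the alignment program actually used, and the function name `percent_identity` is hypothetical. It is applied to residues 61-120 of the ORF83-1/ORF83ng alignment.

```python
def percent_identity(a: str, b: str) -> float:
    """Percentage of aligned columns holding identical residues.

    Assumes both strings come from the same alignment (equal length);
    a gap character '-' never counts as a match.
    """
    if len(a) != len(b):
        raise ValueError("aligned sequences must have equal length")
    if not a:
        raise ValueError("empty alignment")
    matches = sum(1 for x, y in zip(a, b) if x == y and x != "-")
    return 100.0 * matches / len(a)


# Residues 61-120 of the two aligned proteins, copied from the alignment rows.
orf83_1 = "YVSVMGDQGSGNISGGRYSIDALIRGGYHNNPESATQYSYPAYDTTATTKSDALSSVTTS"
orf83ng = "YVSVMGDQGSGNISGGRYSIDALIRGGYHNNPDSATRYSYPAYDTTATTKSDALSGVTTS"
print(f"{percent_identity(orf83_1, orf83ng):.1f}%")  # 95.0%
```

Three of the 60 positions in this row differ (E/D, Q/R and S/G), giving 95.0% for the row. Over the full 313 aa overlap the same calculation gives the overall figure quoted for the two proteins.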

Example 38 

The following DNA sequence, believed to be complete, was identified in N. meningitidis [<SEQ ID 
319>] (SEQ ID NO: 319): 

1 ATGGCAGAGA TCTGTTTGAT AACCGGCACG CCCGGTTCAG GGAAAACATT 

51 AAAAATGGTT TCCATGATGG CGAATGATGA AATGTTTAAG CCTGATGAAA 

101 AAGCCATACG CCGTAAAGTA TTTACGAACA TAAAAGGCTT GAAAATACCG 

151 CACACCTACA TAGAAACGGA CGCAAAAAAG CTGCCGAAAT CGACAGATGA 

201 GCAGCTTTCG GCGCATGATA TGTACGAATG GATAAAGAAG CCCGAAAATA 

251 TCGGGTCTAT TGTCATTGTA GATGAAGCTC AAGACGTATG GCCGGCACGC 

301 TCGGCAGGTT CAAAAATCCC TGAAAATGTC CAATGGCTGA ATACGCACAG 
351 ACATCAGGGC ATTGATATAT TTGTTTTGAC TCAAGGTCCT AAGCTTCTAG 

401 ATCAAAATCT TAGAACGCTT GTACGGAAAC ATTACCACAT CGCTTCAAAC 
451 AAGATGGGTA TGCGTACGCT TTTAGAATGG AAAATATGCG CGGACGATCC 
501 CGTAAAAATG GCATCAAGCG CATTCTCCAG TATCTATACA CTGGATAAAA 
551 AAGTTTATGA CTTGTAysrr TmmGCGGAAG TTCATACCGT AAATAAGGTC 
601 AAGCGGTCAA AGTGGTTTTA CACTCTGCCa GTAATAGTAT TGCTGATTCC 
651 CGTGTTTGTC GGCCTGTCCT ATAAAATGTT GagCaGTTAC GGAAAAAAAC 
701 aGGAAGAACC CGCAGCACAA GAATCGGCGG CAACAGAACA GCAGGCAGTA 
751 CTTCCGGATA AAACAGAAGG CGAGCCGGTA AATAACGGCA ACCTTACCGC 
801 AGATATGTTT GTTCCGACAT TGTCCGAaAA ACCCGrAAGC AAGCcgaTTT 
851 ATAACGGTGT AAGGCAGGTA AGAACCTTTG AATATATAGC AGGCTGTATA 
901 GAAGGCGGAA GAACCGGATG CGCCTGCTAT TCGCaTCAAG GGACGGCATt 
951 gaAAGAAGTG ACGGaGTTGA TGTGcgaAgG aCTATGTaAA AAacGGCTTG 

1001 CCGTTTAACC CaTACAAAGA AGAAAGCCAA GGGCAGGAAG TTCAGCAAAG 

1051 CGCGCAgCAA CATTCGGACA GGGCG£CAAG TTGCCACATT GGGCGGAAAA 

1101 CCGTAGCAGA ACCTAATGTA CGATAATTGG GAAGAACGCG GGAAACCGTT 

1151 TGAAGGAATC GGaCGGGGGC GTGGTCGGAT CGGCAAACTG A 

This corresponds to the amino acid sequence [<SEQ ID 320; ORF84>] (SEQ ID NO: 320; 
ORF84): 



1 MAEICLITGT PGSGKTLKMV SMMANDEMFK PDEKAIRRKV FTNIKGLKIP 

51 HTYIETDAKK LPKSTDEQLS AHDMYEWIKK PENIGSIVIV DEAQDVWPAR 

101 SAGSKIPENV QWLNTHRHQG IDIFVLTQGP KLLDQNLRTL VRKHYHIASN 

151 KMGMRTLLEW KICADDPVKM ASSAFSSIYT LDKKVYDLYX XAEVHTVNKV 

201 KRSKWFYTLP VIVLLIPVFV GLSYKMLSSY GKKQEEPAAQ ESAATEQQAV 

251 LPDKTEGEPV NNGNLTADMF VPTLSEKPXS KPIYNGVRQV RTFEYIAGCI 
301 EGGRTGCACY SHQGTALKEV TELMCKDYVK NGLPFNPYKE ESQGQEVQQS 

351 AQQHSDRAQV ATLGGKPXQN LMYDNWEERG KPFEGIGGGV VGSAN* 



Further work revealed the complete nucleotide sequence [<SEQ ID 321>] (SEQ ID NO: 321): 

1 ATGGCAGAGA TCTGTTTGAT AACCGGCACG CCCGGTTCAG GGAAAACATT 
51 AAAAATGGTT TCCATGATGG CGAATGATGA AATGTTTAAG CCTGATGAAA 
101 ACGGCATACG CCGTAAAGTA TTTACGAACA TAAAAGGCTT GAAAATACCG 
151 CACACCTACA TAGAAACGGA CGCAAAAAAG CTGCCGAAAT CGACAGATGA 
201 GCAGCTTTCG GCGCATGATA TGTACGAATG GATAAAGAAG CCCGAAAATA 
251 TCGGGTCTAT TGTCATTGTA GATGAAGCTC AAGACGTATG GCCGGCACGC 
301 TCGGCAGGTT CAAAAATCCC TGAAAATGTC CAATGGCTGA ATACGCACAG 
351 ACATCAGGGC ATTGATATAT TTGTTTTGAC TCAAGGTCCT AAGCTTCTAG 
401 ATCAAAATCT TAGAACGCTT GTACGGAAAC ATTACCACAT CGCTTCAAAC 
451 AAGATGGGTA TGCGTACGCT TTTAGAATGG AAAATATGCG CGGACGATCC 
501 CGTAAAAATG GCATCAAGCG CATTCTCCAG TATCTATACA CTGGATAAAA 
551 AAGTTTATGA CTTGTACGAA TCAGCGGAAG TTCATACCGT AAATAAGGTC 
601 AAGCGGTCAA AGTGGTTTTA CACTCTGCCA GTAATAGTAT TGCTGATTCC 
651 CGTGTTTGTC GGCCTGTCCT ATAAAATGTT GAGCAGTTAC GGAAAAAAAC 
701 AGGAAGAACC CGCAGCACAA GAATCGGCGG CAACAGAACA GCAGGCAGTA 
751 CTTCCGGATA AAACAGAAGG CGAGCCGGTA AATAACGGCA ACCTTACCGC 
801 AGATATGTTT GTTCCGACAT TGTCCGAAAA ACCCGAAAGC AAGCCGATTT 
851 ATAACGGTGT AAGGCAGGTA AGAACCTTTG AATATATAGC AGGCTGTATA 
901 GAAGGCGGAA GAACCGGATG CGCCTGCTAT TCGCATCAAG GGACGGCATT 
951 GAAAGAAGTG ACGGAGTTGA TGTGCAAGGA CTATGTAAAA AACGGCTTGC 
1001 CGTTTAACCC ATACAAAGAA GAAAGCCAAG GGCAGGAAGT TCAGCAAAGC 
1051 GCGCAGCAAC ATTCGGACAG GGCGCAAGTT GCCACATTGG GCGGAAAACC 
1101 GTAGCAGAAC CTAATGTACG ATAATTGGGA AGAACGCGGG AAACCGTTTG 
1151 AAGGAATCGG CGGGGGCGTG GTCGGATCGG CAAACTGA 



This corresponds to the amino acid sequence [<SEQ ID 322; ORF84-1>] (SEQ ID NO: 322; 
ORF84-1): 



1 MAEICLITGT PGSGKTLKMV SMMANDEMFK PDENGIRRKV FTNIKGLKIP 

51 HTYIETDAKK LPKSTDEQLS AHDMYEWIKK PENIGSIVIV DEAQDVWPAR 

101 SAGSKIPENV QWLNTHRHQG IDIFVLTQGP KLLDQNLRTL VRKHYHIASN 

151 KMGMRTLLEW KICADDPVKM ASSAFSSIYT LDKKVYDLYE SAEVHTVNKV 

201 KRSKWFYTLP VIVLLIPVFV GLSYKMLSSY GKKQEEPAAQ ESAATEQQAV 

251 LPDKTEGEPV NNGNLTADMF VPTLSEKPES KPIYNGVRQV RTFEYIAGCI 

301 EGGRTGCACY SHQGTALKEV TELMCKDYVK NGLPFNPYKE ESQGQEVQQS 

351 AQQHSDRAQV ATLGGKP*QN LMYDNWEERG KPFEGIGGGV VGSAN* 

Computer analysis of this amino acid sequence gave the following results: 
Homology with a predicted ORF from N. meningitidis (strain A) 

ORF84 (SEQ ID NO: 320) shows 93.9% identity over a 395 aa overlap with an ORF (ORF84a) 
(SEQ ID NO: 324) from strain A of N. meningitidis: 



                      10        20        30        40        50        60
orf84.pep     MAEICLITGTPGSGKTLKMVSMMANDEMFKPDEKAIRRKVFTNIKGLKIPHTYIETDAKK
              |||||||||||||||||||||||||||||||||  |||||||||||||||||||||||||
orf84a        MAEICLITGTPGSGKTLKMVSMMANDEMFKPDENGIRRKVFTNIKGLKIPHTYIETDAKK
                      10        20        30        40        50        60

                      70        80        90       100       110       120
orf84.pep     LPKSTDEQLSAHDMYEWIKKPENIGSIVIVDEAQDVWPARSAGSKIPENVQWLNTHRHQG
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
orf84a        LPKSTDEQLSAHDMYEWIKKPENIGSIVIVDEAQDVWPARSAGSKIPENVQWLNTHRHQG
                      70        80        90       100       110       120

                     130       140       150       160       170       180
orf84.pep     IDIFVLTQGPKLLDQNLRTLVRKHYHIASNKMGMRTLLEWKICADDPVKMASSAFSSIYT
              ||||||||| ||||||||||||||||||||||||||||||||||||||||||||||||||
orf84a        IDIFVLTQGSKLLDQNLRTLVRKHYHIASNKMGMRTLLEWKICADDPVKMASSAFSSIYT
                     130       140       150       160       170       180

                     190       200       210       220       230       240
orf84.pep     LDKKVYDLYXXAEVHTVNKVKRSKWFYTLPVIVLLIPVFVGLSYKMLSSYGKKQEEPAAQ
              |||||||||  ||||||||||||||||||||| |||||||||||||||||||||||||||
orf84a        LDKKVYDLYESAEVHTVNKVKRSKWFYTLPVIILLIPVFVGLSYKMLSSYGKKQEEPAAQ
                     190       200       210       220       230       240

                     250       260       270       280       290       300
orf84.pep     ESAATEQQAVLPDKTEGEPVNNGNLTADMFVPTLSEKPXSKPIYNGVRQVRTFEYIAGCI
              |||||| |||  |||||||||||||||||||||||||| |||||||||||||||||||| 
orf84a        ESAATEHQAVFQDKTEGEPVNNGNLTADMFVPTLSEKPESKPIYNGVRQVRTFEYIAGCV
                     250       260       270       280       290       300

                     310       320       330       340       350       360
orf84.pep     EGGRTGCACYSHQGTALKEVTELMCKDYVKNGLPFNPYKEESQGQEVQQSAQQHSDRAQV
              ||||||| ||||||||||| |  |||||  ||||||||||||||  |||| | |||| ||
orf84a        EGGRTGCTCYSHQGTALKEITKEMCKDYARNGLPFNPYKEESQGRDVQQSEQHHSDRPQV
                     310       320       330       340       350       360

                     370       380       390
orf84.pep     ATLGGKPXQNLMYDNWEERGKPFEGIGGGVVGSANX
              ||||||| |||||||| ||||||||||||||||||
orf84a        ATLGGKPWQNLMYDNWQERGKPFEGIGGGVVGSANX
                     370       380       390

The complete length ORF84a nucleotide sequence [<SEQ ID 323>] (SEQ ID NO: 323) is: 



1 ATGGCAGAGA TCTGTTTGAT AACCGGCACG CCCGGTTCAG GGAAAACATT 

51 AAAAATGGTT TCCATGATGG CAAACGATGA AATGTTTAAG CCGGATGAAA 

101 ACGGCATACG CCGTAAAGTA TTTACGAACA TCAAAGGCTT GAAGATACCG 

151 CACACCTACA TAGAAACGGA CGCGAAAAAG CTGCCGAAAT CGACAGATGA 

201 GCAGCTTTCG GCGCATGATA TGTACGAATG GATAAAGAAG CCCGAAAATA 

251 TCGGGTCTAT TGTCATTGTA GATGAAGCTC AAGACGTATG GCCGGCACGC 

301 TCGGCAGGTT CAAAAATCCC TGAAAATGTC CAATGGCTGA ATACGCACAG 

351 ACATCAGGGC ATTGATATAT TTGTTTTGAC TCAAGGCTCT AAGCTTCTAG 

401 ATCAAAATCT TAGAACGCTT GTACGGAAAC ATTACCACAT CGCTTCAAAC 

451 AAGATGGGTA TGCGTACGCT TTTAGAATGG AAAATATGCG CGGACGATCC 

501 CGTAAAAATG GCATCAAGCG CATTCTCCAG TATCTATACA CTGGATAAAA 

551 AAGTTTATGA CTTGTACGAA TCAGCGGAAG TTCATACCGT AAATAAGGTC 

601 AAGCGGTCAA AATGGTTTTA TACTCTGCCA GTAATAATAT TGCTGATTCC 

651 CGTTTTTGTC GGCCTGTCCT ATAAAATGTT AAGTAGTTAT GGAAAAAAAC 

701 AGGAAGAACC CGCAGCACAA GAATCGGCGG CAACAGAACA TCAGGCAGTA 

751 TTTCAGGATA AAACAGAAGG CGAGCCGGTA AACAACGGTA ACCTTACCGC 

801 AGATATGTTT GTTCCGACAT TGTCCGAAAA ACCCGAAAGC AAGCCGATTT 

851 ATAACGGTGT AAGGCAGGTA AGAACCTTTG AATATATAGC AGGCTGTGTA 

901 GAAGGCGGAA GAACCGGATG CACATGCTAT TCGCATCAAG GGACGGCATT 

951 GAAAGAAATT ACAAAGGAAA TGTGCAAGGA TTACGCAAGA AACGGATTGC 

1001 CGTTTAACCC ATATAAAGAA GAAAGCCAAG GGCGGGATGT CCAGCAAAGT 

1051 GAGCAGCACC ATTCGGACAG ACCGCAAGTT GCCACGTTGG GCGGAAAGCC 

1101 GTGGCAAAAT CTTATGTATG ATAATTGGCA GGAGCGCGGA AAACCGTTTG 

1151 AAGGAATCGG CGGGGGCGTG GTCGGATCGG CAAACTGA 

This encodes a protein having amino acid sequence [<SEQ ID 324>] (SEQ ID NO: 324): 

1 MAEICLITGT PGSGKTLKMV SMMANDEMFK PDENGIRRKV FTNIKGLKIP 

51 HTYIETDAKK LPKSTDEQLS AHDMYEWIKK PENIGSIVIV DEAQDVWPAR 

101 SAGSKIPENV QWLNTHRHQG IDIFVLTQGS KLLDQNLRTL VRKHYHIASN 

151 KMGMRTLLEW KICADDPVKM ASSAFSSIYT LDKKVYDLYE SAEVHTVNKV 

201 KRSKWFYTLP VIILLIPVFV GLSYKMLSSY GKKQEEPAAQ ESAATEHQAV 

251 FQDKTEGEPV NNGNLTADMF VPTLSEKPES KPIYNGVRQV RTFEYIAGCV 

301 EGGRTGCTCY SHQGTALKEI TKEMCKDYAR NGLPFNPYKE ESQGRDVQQS 

351 EQHHSDRPQV ATLGGKPWQN LMYDNWQERG KPFEGIGGGV VGSAN* 



ORF84a (SEQ ID NO: 324) and ORF84-1 (SEQ ID NO: 322) show 95.2% identity in 395 aa 
overlap: 



                      10        20        30        40        50        60
orf84a.pep    MAEICLITGTPGSGKTLKMVSMMANDEMFKPDENGIRRKVFTNIKGLKIPHTYIETDAKK
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
orf84-1       MAEICLITGTPGSGKTLKMVSMMANDEMFKPDENGIRRKVFTNIKGLKIPHTYIETDAKK
                      10        20        30        40        50        60

                      70        80        90       100       110       120
orf84a.pep    LPKSTDEQLSAHDMYEWIKKPENIGSIVIVDEAQDVWPARSAGSKIPENVQWLNTHRHQG
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
orf84-1       LPKSTDEQLSAHDMYEWIKKPENIGSIVIVDEAQDVWPARSAGSKIPENVQWLNTHRHQG
                      70        80        90       100       110       120

                     130       140       150       160       170       180
orf84a.pep    IDIFVLTQGSKLLDQNLRTLVRKHYHIASNKMGMRTLLEWKICADDPVKMASSAFSSIYT
              ||||||||| ||||||||||||||||||||||||||||||||||||||||||||||||||
orf84-1       IDIFVLTQGPKLLDQNLRTLVRKHYHIASNKMGMRTLLEWKICADDPVKMASSAFSSIYT
                     130       140       150       160       170       180

                     190       200       210       220       230       240
orf84a.pep    LDKKVYDLYESAEVHTVNKVKRSKWFYTLPVIILLIPVFVGLSYKMLSSYGKKQEEPAAQ
              |||||||||||||||||||||||||||||||| |||||||||||||||||||||||||||
orf84-1       LDKKVYDLYESAEVHTVNKVKRSKWFYTLPVIVLLIPVFVGLSYKMLSSYGKKQEEPAAQ
                     190       200       210       220       230       240

                     250       260       270       280       290       300
orf84a.pep    ESAATEHQAVFQDKTEGEPVNNGNLTADMFVPTLSEKPESKPIYNGVRQVRTFEYIAGCV
              |||||| |||  ||||||||||||||||||||||||||||||||||||||||||||||| 
orf84-1       ESAATEQQAVLPDKTEGEPVNNGNLTADMFVPTLSEKPESKPIYNGVRQVRTFEYIAGCI
                     250       260       270       280       290       300

                     310       320       330       340       350       360
orf84a.pep    EGGRTGCTCYSHQGTALKEITKEMCKDYARNGLPFNPYKEESQGRDVQQSEQHHSDRPQV
              ||||||| ||||||||||| |  |||||  ||||||||||||||  |||| | |||| ||
orf84-1       EGGRTGCACYSHQGTALKEVTELMCKDYVKNGLPFNPYKEESQGQEVQQSAQQHSDRAQV
                     310       320       330       340       350       360

                     370       380       390
orf84a.pep    ATLGGKPWQNLMYDNWQERGKPFEGIGGGVVGSANX
              ||||||| |||||||| ||||||||||||||||||
orf84-1       ATLGGKPXQNLMYDNWEERGKPFEGIGGGVVGSANX
                     370       380       390



Homology with a predicted ORF from N. gonorrhoeae 

ORF84 (SEQ ID NO: 320) shows 94.2% identity over a 395 aa overlap with a predicted ORF 
(ORF84.ng) (SEQ ID NO: 326) from N. gonorrhoeae: 

orf84.pep     MAEICLITGTPGSGKTLKMVSMMANDEMFKPDEKAIRRKVFTNIKGLKIPHTYIETDAKK 60
              |||||||||||||||||||||||||||||||||   |||||||||||||||| |||||||
orf84ng       MAEICLITGTPGSGKTLKMVSMMANDEMFKPDENGVRRKVFTNIKGLKIPHTHIETDAKK 60

orf84.pep     LPKSTDEQLSAHDMYEWIKKPENIGSIVIVDEAQDVWPARSAGSKIPENVQWLNTHRHQG 120
              ||||||||||||||||||||||| | ||||||||||||||||||||||||||||||||||
orf84ng       LPKSTDEQLSAHDMYEWIKKPENVGAIVIVDEAQDVWPARSAGSKIPENVQWLNTHRHQG 120

orf84.pep     IDIFVLTQGPKLLDQNLRTLVRKHYHIASNKMGMRTLLEWKICADDPVKMASSAFSSIYT 180
              |||||||||||||||||||||  ||||| |||| ||||||| ||||||||||||||||||
orf84ng       IDIFVLTQGPKLLDQNLRTLVKRHYHIAANKMGLRTLLEWKVCADDPVKMASSAFSSIYT 180

orf84.pep     LDKKVYDLYXXAEVHTVNKVKRSKWFYTLPVIVLLIPVFVGLSYKMLSSYGKKQEEPAAQ 240
              |||||||||  || ||||||||||||| |||| |||| ||||||||| ||||||||||||
orf84ng       LDKKVYDLYESAEIHTVNKVKRSKWFYALPVIILLIPLFVGLSYKMLGSYGKKQEEPAAQ 240

orf84.pep     ESAATEQQAVLPDKTEGEPVNNGNLTADMFVPTLSEKPXSKPIYNGVRQVRTFEYIAGCI 300
              |||||||||||||||||| ||||||||||||||| ||| |||||||||||||||||||||
orf84ng       ESAATEQQAVLPDKTEGESVNNGNLTADMFVPTLPEKPESKPIYNGVRQVRTFEYIAGCI 300

orf84.pep     EGGRTGCACYSHQGTALKEVTELMCKDYVKNGLPFNPYKEESQGQEVQQSAQQHSDRAQV 360
              ||||||| ||||||||||||||||||||||||||||||||||||||||||||||||||||
orf84ng       EGGRTGCTCYSHQGTALKEVTELMCKDYVKNGLPFNPYKEESQGQEVQQSAQQHSDRAQV 360

orf84.pep     ATLGGKPXQNLMYDNWEERGKPFEGIGGGVVGSAN 395
              ||||||| |||||||||||||||||||||||||||
orf84ng       ATLGGKPQQNLMYDNWEERGKPFEGIGGGVVGSAN 395

The complete length ORF84ng nucleotide sequence [<SEQ ID 325>] (SEQ ID NO: 325) is: 

1 ATGGCAGAAA TCTGTTTGAT AACCGGCACG CCCGGTTCAG GGAAAACATT 

51 AAAAATGGTT TCCATGATGG CAAACGATGA AATGTTTAAG CCAGATGAAA 

101 ACGGCGTACG CCGTAAAGTA TTTACGAACA TCAAAGGTTT GAAGATACCG 

151 CACACCCACA TAGAAACAGA CGCAAAGAAG CTGCCGAAAT CAACCGATGA 

201 ACAGCTTTCG GCGCATGATA TGTATGAATG GATCAAGAAG CCTGAAAacg 

251 tcggcgCAAT CGTTATTGTC GATGAGGCGC AAGACGTATG GCCCGCACGC 

301 TccgCAGGTT CGAAAATCCC CGAAAACGTC CAATGGCTGA ACACACACAG 

351 GCATCAGGGC ATAGATATAT TTGTATTGAC ACAAGGTCCT AAACTCTTAG 

401 ATCAGAACTT GCGAACATTG GTTAAAAGAC ATTACCACAT TGCGGCCAAC 

451 AAAATGGGTT TGCGTACCCT GCTTGAATGG AAAGTATGCG CGGATGACCC 

501 GGTAAAAATG GCATCAAGTG CATTTTCCAG TATCTACACA CTGGATAAAA 

551 AAGTTTATGA CTTGTACGAA TCCGCAGAAA TTCACACGGT AAACAAAGTC 

601 AAGCGTTCAA AATGGTTTTA TGCATTGCCC GTCATCATAT TATTGATTCC 

651 GCTATTTGTC GGTTTGTCTT ACAAAATGTT GGGCAGTTAC GGAAAAAAAC 

701 AGGAAGAACC CGCAGCACAA GAATCGGCGG CAACAGAACA GCAGGCAGTA 

751 CTTCCGGATA AAACAGAAGG AGAATCGGTG AATAACGGAA ACCTTACGGC 

801 AGATATGTTT GTTCCGACAT TGCCCGAAAA ACCCGAAAGC AAGCCGATTT 

851 ATAACGGTGT AAGGCAGGTA AGGACCTTTG AATATATAGC AGGCTGTATA 

901 GAAGGCGGAA GAACCGGATG CACCTGCTAT TCGCATCAAG GGACGGCATT 

951 GAAAGAAGTG ACGGAGTTGA TGTGCAAGGA CTATGTAAAA AACGGCTTGC 

1001 CGTTTAACCC ATACAAAGAA GAAAGCCAAG GGCAGGAAGT TCAGCAAAGC 

1051 GCGCAGCAAC ATTCGGACAG GGCGCAAGTT GCCACCTTGG GCGGAAAACC 

1101 GCAGCAGAAC CTAATGTACG ACAATTGGGA AGAACGCGGG AAACCGTTTG 

1151 AAGGAATCGG CGGGGGCGTG GTCGGATCGG CAAACTGA 

This encodes a protein having amino acid sequence [<SEQ ID 326>] (SEQ ID NO: 326): 

1 MAEICLITGT PGSGKTLKMV SMMANDEMFK PDENGVRRKV FTNIKGLKIP 
51 HTHIETDAKK LPKSTDEQLS AHDMYEWIKK PENVGAIVIV DEAQDVWPAR 
101 SAGSKIPENV QWLNTHRHQG IDIFVLTQGP KLLDQNLRTL VKRHYHIAAN 
151 KMGLRTLLEW KVCADDPVKM ASSAFSSIYT LDKKVYDLYE SAEIHTVNKV 
201 KRSKWFYALP VIILLIPLFV GLSYKMLGSY GKKQEEPAAQ ESAATEQQAV 
251 LPDKTEGESV NNGNLTADMF VPTLPEKPES KPIYNGVRQV RTFEYIAGCI 
301 EGGRTGCTCY SHQGTALKEV TELMCKDYVK NGLPFNPYKE ESQGQEVQQS 
351 AQQHSDRAQV ATLGGKPQQN LMYDNWEERG KPFEGIGGGV VGSAN* 



ORF84ng (SEQ ID NO: 326) and ORF84-1 (SEQ ID NO: 322) show 95.4% identity in 395 aa 
overlap: 



                      10        20        30        40        50        60
orf84-1.pep   MAEICLITGTPGSGKTLKMVSMMANDEMFKPDENGIRRKVFTNIKGLKIPHTYIETDAKK
              ||||||||||||||||||||||||||||||||||| |||||||||||||||| |||||||
orf84ng       MAEICLITGTPGSGKTLKMVSMMANDEMFKPDENGVRRKVFTNIKGLKIPHTHIETDAKK
                      10        20        30        40        50        60

                      70        80        90       100       110       120
orf84-1.pep   LPKSTDEQLSAHDMYEWIKKPENIGSIVIVDEAQDVWPARSAGSKIPENVQWLNTHRHQG
              ||||||||||||||||||||||| | ||||||||||||||||||||||||||||||||||
orf84ng       LPKSTDEQLSAHDMYEWIKKPENVGAIVIVDEAQDVWPARSAGSKIPENVQWLNTHRHQG
                      70        80        90       100       110       120

                     130       140       150       160       170       180
orf84-1.pep   IDIFVLTQGPKLLDQNLRTLVRKHYHIASNKMGMRTLLEWKICADDPVKMASSAFSSIYT
              |||||||||||||||||||||  ||||| |||| ||||||| ||||||||||||||||||
orf84ng       IDIFVLTQGPKLLDQNLRTLVKRHYHIAANKMGLRTLLEWKVCADDPVKMASSAFSSIYT
                     130       140       150       160       170       180

                     190       200       210       220       230       240
orf84-1.pep   LDKKVYDLYESAEVHTVNKVKRSKWFYTLPVIVLLIPVFVGLSYKMLSSYGKKQEEPAAQ
              ||||||||||||| ||||||||||||| |||| |||| ||||||||| ||||||||||||
orf84ng       LDKKVYDLYESAEIHTVNKVKRSKWFYALPVIILLIPLFVGLSYKMLGSYGKKQEEPAAQ
                     190       200       210       220       230       240

                     250       260       270       280       290       300
orf84-1.pep   ESAATEQQAVLPDKTEGEPVNNGNLTADMFVPTLSEKPESKPIYNGVRQVRTFEYIAGCI
              |||||||||||||||||| ||||||||||||||| |||||||||||||||||||||||||
orf84ng       ESAATEQQAVLPDKTEGESVNNGNLTADMFVPTLPEKPESKPIYNGVRQVRTFEYIAGCI
                     250       260       270       280       290       300

                     310       320       330       340       350       360
orf84-1.pep   EGGRTGCACYSHQGTALKEVTELMCKDYVKNGLPFNPYKEESQGQEVQQSAQQHSDRAQV
              ||||||| ||||||||||||||||||||||||||||||||||||||||||||||||||||
orf84ng       EGGRTGCTCYSHQGTALKEVTELMCKDYVKNGLPFNPYKEESQGQEVQQSAQQHSDRAQV
                     310       320       330       340       350       360

                     370       380       390
orf84-1.pep   ATLGGKPXQNLMYDNWEERGKPFEGIGGGVVGSANX
              ||||||| |||||||||||||||||||||||||||
orf84ng       ATLGGKPQQNLMYDNWEERGKPFEGIGGGVVGSANX
                     370       380       390

Based on this analysis, including the presence of a putative transmembrane domain (single- 
underlined) in the gonococcal protein, and a putative ATP/GTP-binding site motif A (P-loop, 
double-underlined), it is predicted that the proteins from N. meningitidis and N. gonorrhoeae, and 
their epitopes, could be useful antigens for vaccines or diagnostics, or for raising antibodies. 
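
A motif of the kind referred to above can be located with a simple pattern search. The sketch below is illustrative only: it assumes the PROSITE consensus for the ATP/GTP-binding site motif A (entry PS00017, `[AG]-x(4)-G-K-[ST]`), which the specification does not itself recite, and the function name `find_p_loops` is hypothetical. It is applied to the first 50 residues of ORF84-1 (SEQ ID NO: 322).

```python
import re

# Assumed consensus for motif A (P-loop): [AG]-x(4)-G-K-[ST] (PROSITE PS00017).
P_LOOP = re.compile(r"[AG].{4}GK[ST]")


def find_p_loops(protein: str):
    """Return (1-based start position, matched residues) for each non-overlapping hit."""
    seq = "".join(protein.split()).upper()   # drop the spaces between 10-residue blocks
    return [(m.start() + 1, m.group()) for m in P_LOOP.finditer(seq)]


# Row 1 of the ORF84-1 amino acid listing.
orf84_1_start = "MAEICLITGT PGSGKTLKMV SMMANDEMFK PDENGIRRKV FTNIKGLKIP"
print(find_p_loops(orf84_1_start))  # [(9, 'GTPGSGKT')]
```

A pattern hit of this kind is only a prediction that a P-loop may be present; it does not by itself establish nucleotide binding.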

Example 39 

The following partial DNA sequence was identified in N. meningitidis [<SEQ ID 327>] (SEQ ID 
NO: 327): 

1 GTGGTTTTCC TGAATGCCGA CAACGGGATA TTGGTTCAGG ACTTGCCTTT 

51 TGAAGTCAAA CTGAAAAAAT TCCATATCGA TTTTTACAAT ACGGGTATGC 

101 CGCGTGATTT CGCCAGCGAT ATTGAAGTGA CGGACAAGGC AACCGGTGAG 

151 AAACTCGAGC GCACCATCCG CGTGAACCAT CCTTTGACCT TGCACGGCAT 

201 CACGATTTAT CAGGCGAGTT TTGCCGACGG CGGTTCGGAT TTGACATTCA 

251 AGGCGTGGAA TTTGGGTGAT GCTTCGCGCG AGCCTGTCGT GTTGAAGGCA 

301 ACATCCATAC ACCAGTTTCC GTTGGAAATT GGCAAACACA AATATCGTCT 

351 TGAGTTCGAT CAGTTCACTT CTATGAATGT GGAGGACATG AGCGAGGGCG 

401 CGGAACGGGA AAAAAGCCTG AAATCCACGC TGCCCGATGT CCGCGCCGTT 

451 ACTCAGGAAG GTCACAAATA CACCAAT TACCG 

501 TATCCGTGAT GCGCCAGGCC AGGCGGTCGA ATATAAAAAC TATATGCTGC 

551 CGGTTTTGCA GGAACAGGAT TATTTTTGGA TTACCGGCAC GCGCAGCGC. 

601 TTGCAGCAGC AATACCGCTG GCTGCGTATC CCCTTGGACA AGCAGTTGAA 

651 AGCGGACACC TTTATGGCAT TGCGTGAGTT TTTGAAAGAT GGGGAAGGGC 

701 GCAAACGTCT .GTTGCCGAC GCAACCAAAG GCGCACCTGC CGAAATCCGC 

751 GAACAATTCA TGCTGGCTGC GGAAAACACG CTGAACATCT TTGCACAAAA 

801 AGGCTATTTG GGATTGGACG AATTTATTAC GTCCAATATC CCGAAAGAGC 

851 AGCAGGATAA GATGCAGGGC TATTTCTACG AAATGCTTTA CGGCGTGATG 

901 AACGCTGCTT TGGATGAAAC CAT.ACCCGG TACGGCTTGC CCGAATGGCA 

951 GCAGGATGAA GCGCGGAATC GTTTCCTGCT GCACAGTATG GATGCGTACA 

1001 CGGGTTTGAC CGAATATCCC GCGCCTATGC TGCTGCAACT TGATGGGTTT 

1051 TCCGAGGTGC GTTCGTCGGG TTTGCAGATG ACCCGTTCCC C.GGTCCGCT 

1101 TTTGGTCTAT CTC... 

This corresponds to the amino acid sequence [<SEQ ID 328; ORF88>] (SEQ ID NO: 328; 
ORF88): 



1 MVFLNADNGI LVQDLPFEVK LKKFHIDFYN TGMPRDFASD IEVTDKATGE 

51 KLERTIRVNH PLTLHGITIY QASFADGGSD LTFKAWNLGD ASREPVVLKA 

101 TSIHQFPLEI GKHKYRLEFD QFTSMNVEDM SEGAEREKSL KSTLPDVRAV 

151 TQEGHKYTNX XXXXXYRIRD APGQAVEYKN YMLPVLQEQD YFWITGTRSX 

201 LQQQYRWLRI PLDKQLKADT FMALREFLKD GEGRKRXVAD ATKGAPAEIR 
251 EQFMLAAENT LNIFAQKGYL GLDEFITSNI PKEQQDKMQG YFYEMLYGVM 

301 NAALDETXTR YGLPEWQQDE ARNRFLLHSM DAYTGLTEYP APMLLQLDGF 
351 SEVRSSGLQM TRSXGPLLVY L... 

Further work revealed the complete nucleotide sequence [<SEQ ID 329>] (SEQ ID NO: 329): 



1 ATGAGTAAAT CCCGTAGATC TCGCCCACTT CTTTCCCGTC CGTGGTTCGC 
51 TTTTTTCAGC TCCATGCGCT TTGCAGTCGC TTTGCTCAGT CTGCTGGGTA 
101 TTGCATCGGT TATCGGTACG GTGTTGCAGC AAAACCAGCC GCAGACGGAT 

151 TATTTGGTCA AATTCGGATC GTTTTGGGCG CAGATTTTTG GTTTTCTGGG 

201 ACTGTATGAC GTCTATGCTT CGGCATGGTT TGTCGTTATC ATGATGTTTT 

251 TGGTGGTTTC TACCAGTTTG TGCCTGATTC GCAATGTGCC GCCGTTCTGG 

301 CGCGAAATGA AGTCTTTTCG GGAAAAGGTT AAAGAAAAAT CTCTGGCGGC 

351 GATGCGCCAT TCTTCGCTGT TGGATGTAAA AATTGCGCCC GAGGTTGCCA 

401 AACGTTATCT GGAAGTACAA GGTTTTCAGG GAAAAACCAT TAACCGTGAA 

451 GACGGGTCGG TTCTGATTGC CGCCAAAAAA GGCACAATGA ACAAATGGGG 

501 CTATATCTTT GCCCATGTTG CTTTGATTGT CATTTGCCTG GGCGGGTTGA 

551 TAGACAGTAA CCTGCTGTTG AAACTGGGTA TGCTGACCGG TCGGATTGTT 

601 CCGGACAATC AGGCGGTTTA TGCCAAGGAT TTCAAGCCCG AAAGTATTTT 

651 GGGTGCGTCC AATCTCTCAT TTAGGGGCAA CGTCAATATT TCCGAGGGGC 

701 AGAGTGCGGA TGTGGTTTTC CTGAATGCCG ACAACGGGAT ATTGGTTCAG 

751 GACTTGCCTT TTGAAGTCAA ACTGAAAAAA TTCCATATCG ATTTTTACAA 

801 TACGGGTATG CCGCGTGATT TCGCCAGCGA TATTGAAGTG ACGGACAAGG 

851 CAACCGGTGA GAAACTCGAG CGCACCATCC GCGTGAACCA TCCTTTGACC 

901 TTGCACGGCA TCACGATTTA TCAGGCGAGT TTTGCCGACG GCGGTTCGGA 

951 TTTGACATTC AAGGCGTGGA ATTTGGGTGA TGCTTCGCGC GAGCCTGTCG 

1001 TGTTGAAGGC AACATCCATA CACCAGTTTC CGTTGGAAAT TGGCAAACAC 

1051 AAATATCGTC TTGAGTTCGA TCAGTTCACT TCTATGAATG TGGAGGACAT 

1101 GAGCGAGGGC GCGGAACGGG AAAAAAGCCT GAAATCCACG CTGAACGATG 

1151 TCCGCGCCGT TACTCAGGAA GGTAAAAAAT ACACCAATAT CGGCCCTTCC 

1201 ATTGTTTACC GTATCCGTGA TGCGGCAGGG CAGGCGGTCG AATATAAAAA 

1251 CTATATGCTG CCGGTTTTGC AGGAACAGGA TTATTTTTGG ATTACCGGCA 
1301 CGCGCAGCGG CTTGCAGCAG CAATACCGCT GGCTGCGTAT CCCCTTGGAC 

1351 AAGCAGTTGA AAGCGGACAC CTTTATGGCA TTGCGTGAGT TTTTGAAAGA 

1401 TGGGGAAGGG CGCAAACGTC TGGTTGCCGA CGCAACCAAA GGCGCACCTG 
1451 CCGAAATCCG CGAACAATTC ATGCTGGCTG CGGAAAACAC GCTGAACATC 
1501 TTTGCACAAA AAGGCTATTT GGGATTGGAC GAATTTATTA CGTCCAATAT 
1551 CCCGAAAGAG CAGCAGGATA AGATGCAGGG CTATTTCTAC GAAATGCTTT 
1601 ACGGCGTGAT GAACGCTGCT TTGGATGAAA CCATACGCCG GTACGGCTTG 
1651 CCCGAATGGC AGCAGGATGA AGCGCGGAAT CGTTTCCTGC TGCACAGTAT 
1701 GGATGCGTAC ACGGGTTTGA CCGAATATCC CGCGCCTATG CTGCTGCAAC 
1751 TTGATGGGTT TTCCGAGGTG CGTTCGTCGG GTTTGCAGAT GACCCGTTCC 
1801 CCGGGTGCGC TTTTGGTCTA TCTCGGCTCG GTGCTGTTGG TATTGGGTAC 
1851 GGTATTGATG TTTTATGTGC GCGAAAAACG GGCGTGGGTA TTGTTTTCAG 
1901 ACGGCAAAAT CCGTTTTGCC ATGTCTTCGG CCCGCAGCGA ACGGGATTTG 
1951 CAGAAGGAAT TTCCAAAACA CGTCGAGAGT CTGCAACGGC TCGGCAAGGA 
2001 CTTGAATCAT GACTGA 

This corresponds to the amino acid sequence [<SEQ ID 330; ORF88-1>] (SEQ ID NO: 330; 
ORF88-1): 



1 MSKSRRSPPL LSRPWFAFFS SMRFAVALLS LLGIASVIGT VLQQNQPQTD 

51 YLVKFGSFWA QIFGFLGLYD VYASAWFVVI MMFLVVSTSL CLIRNVPPFW 

101 REMKSFREKV KEKSLAAMRH SSLLDVKIAP EVAKRYLEVQ GFQGKTINRE 

151 DGSVLIAAKK GTMNKWGYIF AHVALIVICL GGLIDSNLLL KLGMLTGRIV 

201 PDNQAVYAKD FKPESILGAS NLSFRGNVNI SEGQSADVVF LNADNGILVQ 

251 DLPFEVKLKK FHIDFYNTGM PRDFASDIEV TDKATGEKLE RTIRVNHPLT 

301 LHGITIYQAS FADGGSDLTF KAWNLGDASR EPVVLKATSI HQFPLEIGKH 

351 KYRLEFDQFT SMNVEDMSEG AEREKSLKST LNDVRAVTQE GKKYTNIGPS 

401 IVYRIRDAAG QAVEYKNYML PVLQEQDYFW ITGTRSGLQQ QYRWLRIPLD 

451 KQLKADTFMA LREFLKDGEG RKRLVADATK GAPAEIREQF MLAAENTLNI 

501 FAQKGYLGLD EFITSNIPKE QQDKMQGYFY EMLYGVMNAA LDETIRRYGL 

551 PEWQQDEARN RFLLHSMDAY TGLTEYPAPM LLQLDGFSEV RSSGLQMTRS 

601 PGALLVYLGS VLLVLGTVLM FYVREKRAWV LFSDGKIRFA MSSARSERDL 

651 QKEFPKHVES LQRLGKDLNH D* 



Computer analysis of this amino acid sequence gave the following results: 

Homology with a predicted ORF from N. meningitidis (strain A) 

ORF88 (SEQ ID NO: 328) shows 95.7% identity over a 371 aa overlap with an ORF (ORF88a) 
(SEQ ID NO: 332) from strain A of N. meningitidis: 

                                                    10        20        30
orf88.pep                                   MVFLNADNGILVQDLPFEVKLKKFHIDFYN
                                            :|||||||||||||||||||||||||||||
orf88a        AKDFKPESILGASNLSFRGNVNISEGQSADVVFLNADNGILVQDLPFEVKLKKFHIDFYN
              210       220       230       240       250       260

                      40        50        60        70        80        90
orf88.pep     TGMPRDFASDIEVTDKATGEKLERTIRVNHPLTLHGITIYQASFADGGSDLTFKAWNLGD
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
orf88a        TGMPRDFASDIEVTDKATGEKLERTIRVNHPLTLHGITIYQASFADGGSDLTFKAWNLGD
              270       280       290       300       310       320

                     100       110       120       130       140       150
orf88.pep     ASREPVVLKATSIHQFPLEIGKHKYRLEFDQFTSMNVEDMSEGAEREKSLKSTLPDVRAV
              |||||||||||||||||||||||||||||||||||||||||||||||||||||| |||||
orf88a        ASREPVVLKATSIHQFPLEIGKHKYRLEFDQFTSMNVEDMSEGAEREKSLKSTLNDVRAV
              330       340       350       360       370       380

                     160       170       180       190       200       210
orf88.pep     TQEGHKYTNXXXXXXYRIRDAPGQAVEYKNYMLPVLQEQDYFWITGTRSXLQQQYRWLRI
              |||| ||||      |||||| ||||||||||||||||||||||||||| ||||||||||
orf88a        TQEGKKYTNIGPSIVYRIRDAAGQAVEYKNYMLPVLQEQDYFWITGTRSGLQQQYRWLRI
              390       400       410       420       430       440

                     220       230       240       250       260       270
orf88.pep     PLDKQLKADTFMALREFLKDGEGRKRXVADATKGAPAEIREQFMLAAENTLNIFAQKGYL
              |||||||||||||||||||||||||| |||||||||||||||||||||||||||||||||
orf88a        PLDKQLKADTFMALREFLKDGEGRKRLVADATKGAPAEIREQFMLAAENTLNIFAQKGYL
              450       460       470       480       490       500

                     280       290       300       310       320       330
orf88.pep     GLDEFITSNIPKEQQDKMQGYFYEMLYGVMNAALDETXTRYGLPEWQQDEARNRFLLHSM
              |||||||||||||||||||||||||||||||||||||  |||||||||||||||||||||
orf88a        GLDEFITSNIPKEQQDKMQGYFYEMLYGVMNAALDETIRRYGLPEWQQDEARNRFLLHSM
              510       520       530       540       550       560

                     340       350       360       370
orf88.pep     DAYTGLTEYPAPMLLQLDGFSEVRSSGLQMTRSXGPLLVYL
              ||||||||||||||||||||||||||||||||| | |||||
orf88a        DAYTGLTEYPAPMLLQLDGFSEVRSSGLQMTRSPGALLVYLGSVLLVLGTVLMFYVREKR
              570       580       590       600       610       620

orf88a        AWVLFSDGKIRFAMSSARSERDLQKEFPKHVESLQRLGKDLNHDX
              630       640       650       660       670

The complete length ORF88a nucleotide sequence [<SEQ ID 331>] (SEQ ID NO: 331) is: 

1 ATGAGTAAAT CCCGTAGATC TCCCCCACTT CTTTCCCGTC CGTGGTTCGC 
51 TTTTTTCAGC TCCATGCGCT TTGCGGTCGC TTTGCTCAGT CTGCTGGGTA 
101 TTGCATCGGT TATCGGTACG GTGTTGCAGC AAAACCAGCC GCAGACGGAT 
151 TATTTGGTCA AATTCGGATC GTTTTGGGCG CAGATTTTTG GTTTTCTGGG 

201 ACTGTATGAC GTCTATGCTT CGGCATGGTT TGTCGTTATC ATGATGTTTT 
251 TGGTGGTTTC TACCAGTTTG TGCCTGATTC GCAATGTGCC GCCGTTCTGG 

301 CGCGAAATGA AGTCTTTTCG GGAAAAGGTT AAAGAAAAAT CTCTGGCGGC 
351 GATGCGCCAT TCTTCGCTGT TGGATGTAAA AATTGCGCCC GAGGTTGCCA 

401 AACGTTATCT GGAAGTACAA GGTTTTCAGG GAAAAACCAT TAACCGTGAA 
451 GACGGGTCGG TTCTGATTGC CGCCAAAAAA GGCACAATGA ACAAATGGGG 
501 CTATATCTTT GCCCATGTTG CTTTGATTGT CATTTGCCTG GGCGGGTTGA 
551 TAGACAGTAA CCTGCTGTTG AAACTGGGTA TGCTGACCGG TCGGATTGTT 

601 CCGGACAATC AGGCGGTTTA TGCCAAGGAT TTCAAGCCCG AAAGTATTTT 

651 GGGTGCGTCC AATCTCTCAT TTAGGGGCAA CGTCAATATT TCCGAGGGGC 

701 AGAGTGCGGA TGTGGTTTTC CTGAATGCCG ACAACGGGAT ATTGGTTCAG 

751 GACTTGCCTT TTGAAGTCAA ACTGAAAAAA TTCCATATCG ATTTTTACAA 

801 TACGGGTATG CCGCGCGATT TTGCCAGTGA TATTGAAGTA ACGGATAAGG 

851 CAACCGGTGA GAAACTCGAG CGCACCATCC GCGTGAACCA TCCTTTGACC 

901 TTGCACGGCA TCACGATTTA TCAGGCGAGT TTTGCCGACG GCGGTTCGGA 

951 TTTGACATTC AAGGCGTGGA ATTTGGGTGA TGCTTCGCGC GAGCCTGTCG 

1001 TGTTGAAGGC AACATCCATA CACCAGTTTC CGTTGGAAAT TGGCAAACAC 

1051 AAATATCGTC TTGAGTTCGA TCAGTTTACT TCTATGAATG TGGAGGACAT 

1101 GAGCGAGGGC GCGGAACGGG AAAAAAGCCT GAAATCCACG CTGAACGATG 

1151 TCCGCGCCGT TACTCAGGAA GGTAAAAAAT ACACCAATAT CGGCCCTTCC 

1201 ATTGTTTACC GTATCCGTGA TGCGGCAGGG CAGGCGGTCG AATATAAAAA 

1251 CTATATGCTG CCGGTTTTGC AGGAACAGGA TTATTTTTGG ATTACCGGCA 

1301 CGCGCAGCGG CTTGCAGCAG CAATACCGCT GGCTGCGTAT CCCCTTGGAC 

1351 AAGCAGTTGA AAGCGGACAC CTTTATGGCA TTGCGTGAGT TTTTGAAAGA 

1401 TGGGGAAGGG CGCAAACGTC TGGTTGCCGA CGCAACCAAA GGCGCACCTG 

1451 CCGAAATCCG CGAACAATTC ATGCTGGCTG CGGAAAACAC GCTGAACATC 

1501 TTTGCACAAA AAGGCTATTT GGGATTGGAC GAATTTATTA CGTCCAATAT 

1551 CCCGAAAGAG CAGCAGGATA AGATGCAGGG CTATTTCTAC GAAATGCTTT 

1601 ACGGCGTGAT GAACGCTGCT TTGGATGAAA CCATACGCCG GTACGGCTTG 

1651 CCCGAATGGC AGCAGGATGA AGCGCGGAAT CGTTTCCTGC TGCACAGTAT 

1701 GGATGCGTAC ACGGGTTTGA CCGAATATCC CGCGCCTATG CTGCTGCAAC 

1751 TTGATGGGTT TTCCGAGGTG CGTTCGTCGG GTTTGCAGAT GACCCGTTCC 

1801 CCGGGTGCGC TTTTGGTCTA TCTCGGCTCG GTGCTGTTGG TATTGGGTAC 

1851 GGTATTGATG TTTTATGTGC GCGAAAAACG GGCGTGGGTA TTGTTTTCAG 

1901 ACGGCAAAAT CCGTTTTGCC ATGTCTTCGG CCCGCAGCGA ACGGGATTTG 

1951 CAGAAGGAAT TTCCAAAACA CGTCGAGAGT CTGCAACGGC TCGGCAAGGA 

2001 CTTGAATCAT GACTGA 

This encodes a protein having amino acid sequence [<SEQ ID 332>] (SEQ ID NO: 332): 



1 MSKSRRSPPL LSRPWFAFFS SMRFAVALLS LLGIASVIGT VLQQNQPQTD 

51 YLVKFGSFWA QIFGFLGLYD VYASAWFVVI MMFLVVSTSL CLIRNVPPFW 

101 REMKSFREKV KEKSLAAMRH SSLLDVKIAP EVAKRYLEVQ GFQGKTINRE 

151 DGSVLIAAKK GTMNKWGYIF AHVALIVICL GGLIDSNLLL KLGMLTGRIV 

201 PDNQAVYAKD FKPESILGAS NLSFRGNVNI SEGQSADVVF LNADNGILVQ 

251 DLPFEVKLKK FHIDFYNTGM PRDFASDIEV TDKATGEKLE RTIRVNHPLT 

301 LHGITIYQAS FADGGSDLTF KAWNLGDASR EPVVLKATSI HQFPLEIGKH 

351 KYRLEFDQFT SMNVEDMSEG AEREKSLKST LNDVRAVTQE GKKYTNIGPS 

401 IVYRIRDAAG QAVEYKNYML PVLQEQDYFW ITGTRSGLQQ QYRWLRIPLD 
451 KQLKADTFMA LREFLKDGEG RKRLVADATK GAPAEIREQF MLAAENTLNI 

501 FAQKGYLGLD EFITSNIPKE QQDKMQGYFY EMLYGVMNAA LDETIRRYGL 

551 PEWQQDEARN RFLLHSMDAY TGLTEYPAPM LLQLDGFSEV RSSGLQMTRS 

601 PGALLVYLGS VLLVLGTVLM FYVREKRAWV LFSDGKIRFA MSSARSERDL 

651 QKEFPKHVES LQRLGKDLNH D* 

ORF88a (SEQ ID NO: 332) and ORF88-1 (SEQ ID NO: 330) show 100.0% identity in 671 aa overlap: 



orf88a.pep    MSKSRRSPPLLSRPWFAFFSSMRFAVALLSLLGIASVIGTVLQQNQPQTDYLVKFGSFWA 60
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
orf88-1       MSKSRRSPPLLSRPWFAFFSSMRFAVALLSLLGIASVIGTVLQQNQPQTDYLVKFGSFWA 60

orf88a.pep    QIFGFLGLYDVYASAWFVVIMMFLVVSTSLCLIRNVPPFWREMKSFREKVKEKSLAAMRH 120
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
orf88-1       QIFGFLGLYDVYASAWFVVIMMFLVVSTSLCLIRNVPPFWREMKSFREKVKEKSLAAMRH 120

orf88a.pep    SSLLDVKIAPEVAKRYLEVQGFQGKTINREDGSVLIAAKKGTMNKWGYIFAHVALIVICL 180
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
orf88-1       SSLLDVKIAPEVAKRYLEVQGFQGKTINREDGSVLIAAKKGTMNKWGYIFAHVALIVICL 180

orf88a.pep    GGLIDSNLLLKLGMLTGRIVPDNQAVYAKDFKPESILGASNLSFRGNVNISEGQSADVVF 240
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
orf88-1       GGLIDSNLLLKLGMLTGRIVPDNQAVYAKDFKPESILGASNLSFRGNVNISEGQSADVVF 240

orf88a.pep    LNADNGILVQDLPFEVKLKKFHIDFYNTGMPRDFASDIEVTDKATGEKLERTIRVNHPLT 300
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
orf88-1       LNADNGILVQDLPFEVKLKKFHIDFYNTGMPRDFASDIEVTDKATGEKLERTIRVNHPLT 300

orf88a.pep    LHGITIYQASFADGGSDLTFKAWNLGDASREPVVLKATSIHQFPLEIGKHKYRLEFDQFT 360
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
orf88-1       LHGITIYQASFADGGSDLTFKAWNLGDASREPVVLKATSIHQFPLEIGKHKYRLEFDQFT 360

orf88a.pep    SMNVEDMSEGAEREKSLKSTLNDVRAVTQEGKKYTNIGPSIVYRIRDAAGQAVEYKNYML 420
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
orf88-1       SMNVEDMSEGAEREKSLKSTLNDVRAVTQEGKKYTNIGPSIVYRIRDAAGQAVEYKNYML 420

orf88a.pep    PVLQEQDYFWITGTRSGLQQQYRWLRIPLDKQLKADTFMALREFLKDGEGRKRLVADATK 480
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
orf88-1       PVLQEQDYFWITGTRSGLQQQYRWLRIPLDKQLKADTFMALREFLKDGEGRKRLVADATK 480

orf88a.pep    GAPAEIREQFMLAAENTLNIFAQKGYLGLDEFITSNIPKEQQDKMQGYFYEMLYGVMNAA 540
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
orf88-1       GAPAEIREQFMLAAENTLNIFAQKGYLGLDEFITSNIPKEQQDKMQGYFYEMLYGVMNAA 540

orf88a.pep    LDETIRRYGLPEWQQDEARNRFLLHSMDAYTGLTEYPAPMLLQLDGFSEVRSSGLQMTRS 600
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
orf88-1       LDETIRRYGLPEWQQDEARNRFLLHSMDAYTGLTEYPAPMLLQLDGFSEVRSSGLQMTRS 600

orf88a.pep    PGALLVYLGSVLLVLGTVLMFYVREKRAWVLFSDGKIRFAMSSARSERDLQKEFPKHVES 660
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
orf88-1       PGALLVYLGSVLLVLGTVLMFYVREKRAWVLFSDGKIRFAMSSARSERDLQKEFPKHVES 660

orf88a.pep    LQRLGKDLNHD 672
              |||||||||||
orf88-1       LQRLGKDLNHD 672



Homology with a predicted ORF from N. gonorrhoeae 

ORF88 (SEQ ID NO: 328) shows 93.8% identity over a 371 aa overlap with a predicted ORF
(ORF88.ng) (SEQ ID NO: 334) from N. gonorrhoeae:



orf 88 pep MVFLNADNG ILVQDLPFEVKLKKFHIDFYNTGMPRDFASD I EVTDKATGEKLERT IRVNH 60 

40 | | | | | | | | | : | | || | | | | | | | || | | | | | || I I I I I I I I I M I I I II II I I I I I I I I I I I I 

orf88ng MVFLNADNGMLVQDLPFEVKLKKFHIDFYNTGMPRDFASDIEVTDKATGEKLERTIRVNH 60

orf88.pep PLTLHGITIYQASFADGGSDLTFKAWNLGDASREPVVLKATSIHQFPLEIGKHKYRLEFD 120

Illlllllllllllllllllllllllll IIIIIIMIIIIII llllllllllllll Ml 

orf 88ng PLTLHGITIYQASFADGGSDLTFKAWNLRDASREPVVLKATSIHQFPLEIGKHKYRLEFD 120 

orf 88 . pep QFTSMNVEDMSEGAEREKSLKSTLPDVRAVTQEGHKYTNXXXXXXYRIRDAPGQAVEYKN 180 

I | | | | M I I I I I I I I I I I I II I I I I I I I I I I I I = I I I I I I I I I I I I I I I I I I 

orf 8 8ng QFTSMNVEDMSEGAEREKSLKSTLNDVRAVTQEGKKYTNIGPS I VYR I RDAAGQAVE YKN 180 

orf 88 .pep YMLPVLQEQDYFWITGTRSXLQQQYRWLRIPLDKQLKADTFMALREFLKDGEGRKRXVAD 240 

||||:||::||||:|llll I I I I I I I I I I I I I I I II I I I I I I I I I I I II I I I I I I III 

orf 88ng YMLPI LQDKDYFWLTGTRSGLQQQYRWLRI PLDKQLKADTFMALREFLKDGEGRKRLVAD 24 0 

orf 88. pep ATKGAPAEIREQFMLAAENTLNIFAQKGYLGLDEFITSNIPKEQQDKMQGYFYEMLYGVM 300 

Ml Mill lllllill IIIIIIIMIIIIIII INI II IMIIIIIIIIII 

orf 8 8ng ATKDAPAE I REQFMLAAENTLN I FAQKGYLGLDE F I TSNI PKGQQDKMQGYFYEMLYGVM 3 00 

orf 8 8 . pep NAALDETXTRYGLPEWQQDEARNRFLLHSMDAYTGLTEYPAPMLLQLDGFSEVRSSGLQM 360 

lllllll 1 1 1 1 1 1 1 1 1 1 1 1 1 M M M 1 1 M 1 1 1 1 ■ 1 1 1 1 1 1 1 1 1 1 1 1 1 M 1 1 1 1 1 1 1 

orf 88ng NAALDETIRRYGLPEWQQDEARNRFLLHSMDAYTGLTEYPAPMLLQLDGFSEVRSSGLQM 360 

or f 8 8 . pep TRSXGPLLVYL 3 71 
III I Mill 

orf 88ng TRSPGALLVYLGSVLLVLGTVFMFYVPKKRAWVLFSNXKIRFAMSSARSERDLQKEFPKH 42 0 

An ORF88ng nucleotide sequence [<SEQ ID 333>] (SEQ ID NO: 333) was predicted to encode a
protein having amino acid sequence [<SEQ ID 334>] (SEQ ID NO: 334):



1 MVFLNADNGM LVQDLPFEVK LKKFHIDFYN TGMPRDFASD IEVTDKATGE
51 KLERTIRVNH PLTLHGITIY QASFADGGSD LTFKAWNLRD ASREPVVLKA
101 TSIHQFPLEI GKHKYRLEFD QFTSMNVEDM SEGAEREKSL KSTLNDVRAV
151 TQEGKKYTNI GPSIVYRIRD AAGQAVEYKN YMLPILQDKD YFWLTGTRSG
201 LQQQYRWLRI PLDKQLKADT FMALREFLKD GEGRKRLVAD ATKDAPAEIR
251 EQFMLAAENT LNIFAQKGYL GLDEFITSNI PKGQQDKMQG YFYEMLYGVM
301 NAALDETIRR YGLPEWQQDE ARNRFLLHSM DAYTGLTEYP APMLLQLDGF
351 SEVRSSGLQM TRSPGALLVY LGSVLLVLGT VFMFYVPKKR AWVLFSNXKI
401 RFAMSSARSE RDLQKEFPKH VESLQRLGKD LNHD*

Further work revealed the complete gonococcal DNA sequence [<SEQ ID 335>] (SEQ ID NO:
335):



1 ATGAGTAAAT CCCGTATATC TCCCACACTT CTTTCCCGTC CGTGGTTCGC 

51 TTTTTTCAGC TCCATGCGCT TTGCGGTCGC TTTGCTCAGT CTGCTGGGTA 

101 TTGCATCGGT TATCGGCACG GTGTTACAGC AAAACCAGCC GCAGACGGAT 

151 TATTTGGTCA AATTCGGACC GTTTTGGACT CGGATTTTTG ATTTTTTGGG 

2 01 TTTGTATGAT GTCTATGCTT CGGCATGGTT TGTCGTTATC ATGATGTTTC 

2 51 TGGTGGTTTC TACCAGTTTG TGTTTAATCC GTAACGTTCC GCCGTTTTGG 

301 CGCGAAATGA AGTCTTTCCG GGAAAAGGTT AAAGAAAAAT CTCTGGCGGC 

351 GATGCGCCAT TCTTCGCTGT TGGATGTAAA AATTGCCCCC GAAGTTGCCA 

4 01 AACGTTATCT GGAGGTGCGG GGTTTTCAGG GAAAAACCGT CAGCCGTGAG 

4 51 GACGGGTCGG TTCTGATTGC CGCCAAAAAA GGCAcaatga acaaATGGGG 

501 CTATATCTTT GCccaagtag ctTTGATTGT CATTTGCCTG GGCGGGTTGA 

551 TAGACAGTAA CCTGCTGCTG AAGCTGGGTA TGCTGGCCGG TCGGATTGTT 

601 CCGGACAATC AGGCGGTTTA TGCCAAGGAT TTCAAGCCCG AAAGTATTTT 

651 GGGTGCGTCC AATCTCTCAT TTAGGGGCAA CGTCAATATT TCCGAGGGGC 

701 AAAGTGCGGA TGTGGTTTTC CTGAATGCCG ACAACGGGAT GTTGGTTCAG 

751 GACTTGCCTT TTGAAGTCAA ACTGAAAAAA TTCCATATCG ATTTTTACAA

801 TACGGGTATG CCGCGCGATT TTGCCAGCGA TATTGAAGTA ACGGACAAGG

851 CAACCGGTGA GAAACTCGAG CGCACCATCC GCGTGAACCA TCCTTTGACC 

901 TTGCACGGCA TCACGATTTA TCAGGCGAGT TTTGCCGACG GCGGTTCGGA 

951 TTTGACATTC AAGGCGTGGA ATTTGAGGGA TGCTTCGCGC GAACCTGTCG 

1001 TGTTGAAGGC AACCTCCATA CACCAGTTTC CGTTGGAAAT CGGCAAACAC 

1051 AAATATCGTC TTGAGTTCGA TCAGTTCACT TCTATGAATG TGGAGGACAT 

1101 GAGCGAGGGT GCGGAACGGG AAAAAAGCCT GAAATCCACT CTGAACGATG 

1151 TCCGCGCCGT TACTCAGGAA GGTAAAAAAT ACACCAATAT CGGCCCTTCC 

12 01 ATCGTGTACC GCATCCGTGA TGcggCAGGG CAGGCGGTCG AATATAAAAA 
1251 CTATATGCTG CCGATTTTGC AGGACAAAGA TTATTTTTGG CTGACCGGCA 
1301 CGCGCAGCGG CTTGCAGCAG CAATACCGCT GGCTGCGTAT CCCCTTGGAC 

13 51 AAGCAGTTGA AAGCGGACAC CTTTATGGCA TTGCGTGAGT TTTTGAAAGA 

14 01 TGGGGAAGGG CGCAAACGTC TGGTTGCCGA CGCAACCAAA GACGCACCTG 
14 51 CCGAAATCCG CGAACAATTC ATGCTGGCTG CGGAAAACAC GCTGAATATC 
1501 TTTGCGCAAA AAGGCTATTT GGGATTGGAC GAATTTATTA CGTCCAATAT 
1551 CCCGAAAGGG CAGCAGGATA AGATGCAGGG CTATTTCTAC GAAATGCTTT 
1601 ACGGCGTGAT GAACGCTGCT TTGGATGAAA CCATACGCCG GTACGGCTTG 
1651 CCCGAATGGC AGCAGGATGA AGCGCGGAAC CGTTTCCTGC TGCACAGTAT 
1701 GGATGCCTAT ACGGGGCTGA CGGAATATCC CGCGCCTATG CTGCTCCAGC 
1751 TTGACGGGTT TTCCGAGGTG CGTTCCTCAG GTTTGCAGAT GACCCGTTCG 
1801 CCGGGTGCGC TTTTGGTCTA TCtcggctcg gtattgttgg TTTTGGgtac 
1851 ggtaTttatg tTTTATGTGC GCGAAAAACG GGCGTGGgta tTGTTTTCag 
1901 aCGGCAAAAT CCGTTTTGCT ATGtCTTcgg CCcgcagcga ACGGGATTTG 
1951 cAGAaggaaT TTCCAAAACA CGtcgAGAGC CTGCAACggc tcggcaaggA 
2001 CttgaaTCAT GACTga 



This corresponds to the amino acid sequence [<SEQ ID 336; ORF88ng-1>] (SEQ ID NO: 336;
ORF88ng-1):



1 MSKSRISPTL LSRPWFAFFS SMRFAVALLS LLGIASVIGT VLQQNQPQTD
51 YLVKFGPFWT RIFDFLGLYD VYASAWFVVI MMFLVVSTSL CLIRNVPPFW
101 REMKSFREKV KEKSLAAMRH SSLLDVKIAP EVAKRYLEVR GFQGKTVSRE
151 DGSVLIAAKK GTMNKWGYIF AQVALIVICL GGLIDSNLLL KLGMLAGRIV
201 PDNQAVYAKD FKPESILGAS NLSFRGNVNI SEGQSADVVF LNADNGMLVQ
251 DLPFEVKLKK FHIDFYNTGM PRDFASDIEV TDKATGEKLE RTIRVNHPLT
301 LHGITIYQAS FADGGSDLTF KAWNLRDASR EPVVLKATSI HQFPLEIGKH
351 KYRLEFDQFT SMNVEDMSEG AEREKSLKST LNDVRAVTQE GKKYTNIGPS
401 IVYRIRDAAG QAVEYKNYML PILQDKDYFW LTGTRSGLQQ QYRWLRIPLD
451 KQLKADTFMA LREFLKDGEG RKRLVADATK DAPAEIREQF MLAAENTLNI
501 FAQKGYLGLD EFITSNIPKG QQDKMQGYFY EMLYGVMNAA LDETIRRYGL
551 PEWQQDEARN RFLLHSMDAY TGLTEYPAPM LLQLDGFSEV RSSGLQMTRS
601 PGALLVYLGS VLLVLGTVFM FYVREKRAWV LFSDGKIRFA MSSARSERDL
651 QKEFPKHVES LQRLGKDLNH D*



ORF88ng-1 (SEQ ID NO: 336) and ORF88-1 (SEQ ID NO: 330) show 97.0% identity in 671 aa
overlap:

orf88-1.pep MSKSRRSPPLLSRPWFAFFSSMRFAVALLSLLGIASVIGTVLQQNQPQTDYLVKFGSFWA 60
orf88ng-1   MSKSRISPTLLSRPWFAFFSSMRFAVALLSLLGIASVIGTVLQQNQPQTDYLVKFGPFWT 60

orf88-1.pep QIFGFLGLYDVYASAWFVVIMMFLVVSTSLCLIRNVPPFWREMKSFREKVKEKSLAAMRH 120
orf88ng-1   RIFDFLGLYDVYASAWFVVIMMFLVVSTSLCLIRNVPPFWREMKSFREKVKEKSLAAMRH 120

orf88-1.pep SSLLDVKIAPEVAKRYLEVQGFQGKTINREDGSVLIAAKKGTMNKWGYIFAHVALIVICL 180

1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 = 1 1 1 1 1 1 = = 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 = 1 1 1 1 1 1 1 1

orf 88ng- 1 SSLLDVKIAPEVAKRYLEVRGFQGKTVSREDGSVLIAAKKGTMNKWGYIFAQVALIVICL 180 
orf 88-1 .pep GGLIDSNLLLKLGMLTGRIVPDNQAVYAKDFKPESILGASNLSFRGNVNISEGQSADWF 240 

IIIMIIIIII illMMIIIIIIIIIMIMIIIIIIIIMIIIIIMIIIIMIII! 

5 orf 88ng-l GGL I DSNLLLKLGMLAGR I VPDNQAVYAKDFKPES I LGASNLS FRGNVN I SEGQS ADWF 24 0 

orf 88-1 .pep LNADNGI LVQDLP FE VKLKKFH I DF YNTGMPRDFASD I EVTDKATGEKLERT I RVNHPLT 300 

I M 1 1 1 : I I J l ! 1 1 1 i 1 1 1 1 1 1 1 1 1 1 1 1 1 M 1 1 1 1 

orf 88ng-l LNADNGMLVQDLPFE VKLKKFH I DFYNTGMPRDFASD I EVTDKATGEKLERT I RVNHPLT 300 
orf 88-1 .pep LHGITIYQASFADGGSDLTFKAWNLGDASREPWLKATSIHQFPLEIGKHKYRLEFDQFT 360 

10 Illllllllllllllllllllllll III III I MM II II 1 1 MM I II III I II I III 

orf 88ng-l LHGITIYQASFADGGSDLTFKAWNLRDASREPWLKATSIHQFPLEIGKHKYRLEFDQFT 360 

orf 8 8-1 .pep SMNVEDMSEGAEREKSLKSTLNDVRAVTQEGKKYTNIGPSIVYRIRDAAGQAVEYKNYML 42 0 

I I I I I II II II II I I I I I I I I I I I I I I I I I I M I I I M I II II I II II I II I I M I I II I 
orf 8 8ng- 1 SMNVEDMSEGAEREKSLKSTLNDVRAVTQEGKKYTNIGPS I VYR I RDAAGQAVE YKNYML 42 0 

15 orf 88-1 .pep PVLQEQDYFWITGTRSGLQQQYRWLRIPLDKQLKADTFMALREFLKDGEGRKRLVADATK 4 80 

I : I I :: I M I : I I I I M II M I I M I I I M I I I II II II II I I II II I I I I I I I I I I I I I 
orf 8 8ng- 1 PILQDKDYFWLTGTRSGLQQQYRWLRIPLDKQLKADTFMALREFLKDGEGRKRLVADATK 4 8 0 

orf 88-1 .pep GAPAEIREQFMLAAENTLNIFAQKGYLGLDEFITSNIPKEQQDKMQGYFYEMLYGVMNAA 540 

I I II I I M II II I I II I I I I I I I I I I I I I I II I I I I II I I I I I I I I I M I I I : I 
20 orf 88ng-l DAPAEIREQFMLAAENTLNIFAQKGYLGLDEFITSNIPKGQQDKMQGYFYEMLYGVMNAA 54 0 

orf 88 - 1 . pep LDETIRRYGLPEWQQDEARNRFLLHSMDAYTGLTEYPAPMLLQLDGFSEVRSSGLQMTRS 600 

1 1 1 1 1 1 1 1 1 1 1 1 M I M I M II 1 1 1 1 M 1 1 1 1 1 M 1 1 1 1 1 1 1 II 1 1 1 1 1 II I I II i 

orf 88ng- 1 LDETIRRYGLPEWQQDEARNRFLLHSMDAYTGLTEYPAPMLLQLDGFSEVRSSGLQMTRS 600 
orf 88-1 .pep PGALLWLGSVLLVLGTVLMFYVREKRAWVLFSDGKIRFAMSSARSERDLQKEFPKHVES 660 

25 | || II II II I II II I I I h I I 1 1 II I I 1 1 1 1 II 1 1 1 1 I II I II I II I 1 1 1 1 I I I I I II I I 

orf 88ng- 1 PGALLVYLGSVLLVLGTVFMFYVREKRAWVLFSDGKIRFAMSSARSERDLQKEFPKHVES 660 



30 



orf 88-1. pep LQRLGKDLNHD 671 

IMIIIIMM 
orf88ng-l LQRLGKDLNHD 671 



Furthermore, ORF88ng-1 (SEQ ID NO: 336) shows homology with a hypothetical protein (SEQ
ID NO: 1134) from Aquifex aeolicus:



35 



gi|2984296 (AE000771) hypothetical protein [Aquifex aeolicus] Length = 537
Score = 94.4 bits (231), Expect = 2e-18 

Identities = 91/334 (27%), Positives = 159/334 (47%), Gaps = 59/334 (17%) 



Query: 16 FAFFSSMRFAVALLSLLGIASVIG-TVLQQNQPQTDYLVKFGPFWTRIFDFLGLYDVYAS 74 

+ F +S++ A+ ++ +LGI S++G T ++QNQ YL +FG L L DV+ S 

Sbjct: 80 YDFLASLKLAIFIMLVLGILSMLGSTYIKQNQSFEWYLDQFGYDVGIWIWKLWLNDVFHS 139

Query: 75 AWFVVIMMFLVVSTSLCLIRNVPPFWREMKSFREKVKEKSLAAMRHSSLLDVKIAPEVAK 134

+ + + + ++ L V+ C 1+ +P W++ S +E+ + + A +H + VKI P+ K 
Sbjct: 140 WYYILFIVLLAVNLIFCSIKRLPRVWKQAFS-KERILKLDEHAEKHLKPITVKI - PDKDK 197 



Query: 135 - - RYLEVRGFQGKTVSREDGS VL I AAKKGTMNKWGY I FAQVAL I VI CLGGL I DSNLLLKL 192 

++L +GF+ V E + + A+KG + + G +AL+VI G LID 
Sbjct: 198 VLKFLLKKGFK-VFVEEEGNKLYVFAEKGRFSRLGVYITHIALLVIMAGALID 249

Query: 193 GMLAGRIVPDNQAVYAKDFKPESILGASNLSFRGNVNISEGQSADVVFLNADNGMLVQDL 252
+ I+G RG++ ++EG + DV+ + A+ L
Sbjct: 250 AIVGV-RGSLIVAEGDTNDVMLVGAE--QKPYKL 280

Query: 253 PFEVKLKKFHIDFY NTGMPRDFA--SDIEVTDKATGEKLER--TIRVNHPLT 300
PFVLFIY N++FA SDIE+ + G K+E T++VN P
Sbjct: 281 PFAVHLIDFRIKTYAEENPNVDKRFAQAVSSYESDIEI IN GGKVE AKGT VKVNE P FD 337

Query: 301 LHGITIYQASFA--DGGSDLTFKAWNLRDASREP 332
++QA++ DG S + + + A + P
Sbjct: 338 FGRYRL FQ AT YG I LDGTS GMG V I WDRKKAHED P 371
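The percentages in the BLAST summary line above (Identities = 91/334, Positives = 159/334, Gaps = 59/334) can be checked directly from the raw counts. The following is a minimal sketch; the function name `blast_percent` is hypothetical, and the use of integer truncation rather than rounding is an assumption that happens to reproduce the three reported figures.

```python
def blast_percent(count: int, aligned_length: int) -> int:
    """Whole-number percentage of `count` over `aligned_length`.

    Assumption: the report truncates (floor) instead of rounding, which is
    consistent with 159/334 being printed as 47% rather than 48%.
    """
    if aligned_length <= 0:
        raise ValueError("aligned_length must be positive")
    return (100 * count) // aligned_length


# Raw counts copied from the Aquifex aeolicus hit reported above.
summary = {
    "Identities": (91, 334),
    "Positives": (159, 334),
    "Gaps": (59, 334),
}

for name, (count, length) in summary.items():
    print(f"{name} = {count}/{length} ({blast_percent(count, length)}%)")
```

With the counts above, the sketch yields 27%, 47% and 17%, matching the reported summary line.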





Based on this analysis, including the putative transmembrane domain in the gonococcal protein, it
is predicted that the proteins from N. meningitidis and N. gonorrhoeae, and their epitopes, could be
useful antigens for vaccines or diagnostics, or for raising antibodies.

The following DNA sequence, believed to be complete, was identified in N. meningitidis [<SEQ ID
337>] (SEQ ID NO: 337):

1 ATGATGAGTA ATAmAATGGm ACAAAAAGGG TTTACATTGA TTGmGmTGAT 

51 GATAGTCGTC GCGATACTCG GCATTATCAG CGTCATTGCC ATACGTTCTT 

101 ATCmAAGTTA TATTGAAAAA GGCTATCAGT CCCAGCTTTA TACGGAGATG

151 GyCGGTATCA ACAATATTTC CAAACAGTTT ATTTTGAAAA ATCCCCTGGA 

201 CGATAATCAG ACCATCGAGA ACAAACTGGA AATATTTGTC TCAGGCTATA 

251 AGATGAATCC GAAAATTGCC AAAAAaTATA GTGTTTCGGT AAAGTTTGTC 

301 GATAAGGAAA AATCAAGGGC ATACAGGTTG GTCGGCGTTC CGAAGGCGGG 

351 GACGGGTTAT ACTTTGTCGG TATGGATGAA CAGCGTGGGC GACGGATACA 

4 01 AATGCCGTGA TGCCGCTTCT GCCCAAGCCC ATTTGGAGAC CTTGTCCTCA 

451 GATGTCGGCT GTGAAGCCTT CTCTAATCGT AAAAAATAA 

This corresponds to the amino acid sequence [<SEQ ID 338; ORF89>] (SEQ ID NO: 338;
ORF89):



1 MMSNXMXQKG FTLIXXMIVV AILGIISVIA IPSYXSYIEK GYQSQLYTEM
51 XGINNISKQF ILKNPLDDNQ TIENKLEIFV SGYKMNPKIA KKYSVSVKFV 
101 DKEKSRAYRL VGVPKAGTGY TLSVWMNSVG DGYKCRDAAS AQAHLETLSS 
151 DVGCEAFSNR KK* 

Further work revealed the complete nucleotide sequence [<SEQ ID 339>] (SEQ ID NO: 339):



1 ATGATGAGTA ATAAAATGGA ACAAAAAGGG TTTACATTGA TTGAGATGAT 

51 GATAGTCGTC GCGATACTCG GCATTATCAG CGTCATTGCC ATACCTTCTT 

101 ATCAAAGTTA TATTGAAAAA GGCTATCAGT CCCAGCTTTA TACGGAGATG 

151 GTCGGTATCA ACAATATTTC CAAACAGTTT ATTTTGAAAA ATCCCCTGGA 

201 CGATAATCAG ACCATCGAGA ACAAACTGGA AATATTTGTC TCAGGCTATA 

251 AGATGAATCC GAAAATTGCC AAAAAATATA GTGTTTCGGT AAAGTTTGTC 

301 GATAAGGAAA AATCAAGGGC ATACAGGTTG GTCGGCGTTC CGAAGGCGGG 

351 GACGGGTTAT ACTTTGTCGG TATGGATGAA CAGCGTGGGC GACGGATACA 

401 AATGCCGTGA TGCCGCTTCT GCCCAAGCCC ATTTGGAGAC CTTGTCCTCA

451 GATGTCGGCT GTGAAGCCTT CTCTAATCGT AAAAAATAA

This corresponds to the amino acid sequence [<SEQ ID 340; ORF89-1>] (SEQ ID NO: 340;
ORF89-1):

1 MMSNKMEQKG FTLIEMMIVV AILGIISVIA IPSYQSYIEK GYQSQLYTEM
51 VGINNISKQF ILKNPLDDNQ TIENKLEIFV SGYKMNPKIA KKYSVSVKFV
101 DKEKSRAYRL VGVPKAGTGY TLSVWMNSVG DGYKCRDAAS AQAHLETLSS
151 DVGCEAFSNR KK*
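The relationship between the nucleotide listings and the deduced amino acid sequences above can be illustrated with a short translation routine. This is a sketch only: it assumes the standard codon assignments, assumes that lower-case letters in the listings are ordinary bases, and reports any codon containing an ambiguity code (such as `m`, `y` or `N`) as `X`, which is how the partial ORF89 sequence appears to have been rendered. The names `CODON_TABLE` and `translate` are hypothetical.

```python
from itertools import product

# Standard genetic code, listed in the conventional T, C, A, G order.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(dna: str) -> str:
    """Translate a DNA listing in frame 1; unresolved codons become 'X'."""
    dna = "".join(dna.split()).upper()  # drop the spacing between 10-base blocks
    protein = []
    for i in range(0, len(dna) - 2, 3):
        protein.append(CODON_TABLE.get(dna[i:i + 3], "X"))
    return "".join(protein)


# First 30 bases of the partial sequence (SEQ ID 337) and of the complete one (SEQ ID 339).
partial_start = "ATGATGAGTA ATAmAATGGm ACAAAAAGGG"
complete_start = "ATGATGAGTA ATAAAATGGA ACAAAAAGGG"
# Last 12 bases of the complete sequence, ending in the TAA stop codon.
complete_end = "CGT AAAAAATAA"

print(translate(partial_start))   # ambiguous codons appear as X
print(translate(complete_start))
print(translate(complete_end))
```

The two ambiguous codons in the partial sequence correspond to the two `X` residues near the N-terminus of ORF89, which are resolved as K and E in ORF89-1.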

Computer analysis of this amino acid sequence gave the following results:

Homology with PilE of N. gonorrhoeae (accession number Z69260) (SEQ ID NO: 1135).

ORF89 (SEQ ID NO: 338) and PilE protein (SEQ ID NO: 1135) show 30% aa identity in 120 aa
overlap:

orf 

15 QKGFTLI MIV+AI+GI + + +A+P+Y Y + S+ G + ++L + + 

QKGFTLIELMIVIAIVGILAAVALPAYQDYTARAQVSEAILLAEGQKSAVTEYYLNHGIW 64 

-DDNQTIENKLEIFVSGYKMNPKIAKKYSVSVKFVDKEKSRAYRLVGVPKAGTGYTLSVW 12 i 
DN + +G + KI KY SV + GV K G LS+W 



orf89 


8 


PilE 


5 


orf 89 


67 


PilE 


65 



Homology with a predicted ORF from N. meningitidis (strain A)

ORF89 (SEQ ID NO: 338) shows 83.3% identity over a 162 aa overlap with an ORF (ORF89a)
(SEQ ID NO: 342) from strain A of N. meningitidis:

10 20 30 40 50 60 

orf 89 .pep MMSNXMXQKGFTLIXXMIWAILGIISVIAIPSYXSYIEKGYQSQLYTEMXGINNISKQF 

25 Mill I II MM 1 1 II III I MM II I Ml Ml II I 1 1 1 1 1 1 1 1 

orf 89a MMSNKMEQKGFTLIXXXXXXAIXXXXSVIXXXXYXSYIEKGYQSQLYTEMVGINNISKQX 

10 20 30 40 50 60 

70 80 90 100 110- 120 

orf 89 .pep I LKNPLDDNQT I ENKLE I FVSGYKMNPKIAKKYSVSVKFVDKEKSRAYRL VGVPKAGTGY 

30 | M I I I I II I I h : I I I I I I II I I I I I II h I h II h I h : I I I I I I I II I h II I I 

orf 89a ILKNPLDDNQTIKSKLEIFVSGYKMNPKIAEKYNVSVHFVNEEKPRAYSLVGVPKTGTGY 

70 80 90 100 110 120 

130 140 150 160 

orf 89 . pep TLSVWMNSVGDGYKCRDAASAQAHLETLSSDVGCEAFSNRKKX 

35 1 1 1 1 II I II I II I II II 1 1 1 1 Ml I II 1 1 II M 1 1 II 1 1 II 1 1 

or f 8 9 a TLS VWMNS VGDG YKCRDAAS ARAHLETLS SD VGCEAFSNRKKX 

130 140 150 160 



The complete length ORF89a nucleotide sequence [<SEQ ID 341>] (SEQ ID NO: 341) is:

1 ATGATGAGTA ATAAAATGGA ACAAAAAGGG TTTACATTGA TTGNGANGNT
51 NATNGNCNTC GCGATACNCN GCNTTANCAG CGTCATTNCN ATNNNTNCNT
101 ATCNNAGTTA TATTGAAAAA GGCTATCAGT CCCAGCTTTA TACGGAGATG
151 GTCGGTATCA ACAATATTTC CAAACAGTNT ATTTTGAAAA ATCCCCTGGA
201 CGATAATCAG ACCATCAAGA GCAAACTGGA AATATTTGTC TCAGGCTATA
251 AGATGAATCC GAAAATTGCC GAAAAATATA ATGTTTCGGT GCATTTTGTC
301 AATGAGGAAA AACCNAGGGC ATACAGCTTG GTCGGCGTTC CAAAGACGGG
351 GACGGGTTAT ACTTTGTCGG TATGGATGAA CAGCGTGGGC GACGGATACA
401 AATGCCGTGA TGCCGCTTCT GCCCGAGCCC ATTTGGAGAC CTTGTCCTCA
451 GATGTCGGCT GTGAAGCCTT CTCTAATCGT AAAAAATAG

This encodes a protein having amino acid sequence [<SEQ ID 342>] (SEQ ID NO: 342):

1 MMSNKMEQKG FTLIXXXXXX AIXXXXSVIX XXXYXSYIEK GYQSQLYTEM
51 VGINNISKQX ILKNPLDDNQ TIKSKLEIFV SGYKMNPKIA EKYNVSVHFV
101 NEEKPRAYSL VGVPKTGTGY TLSVWMNSVG DGYKCRDAAS ARAHLETLSS
151 DVGCEAFSNR KK*

ORF89a (SEQ ID NO: 342) and ORF89-1 (SEQ ID NO: 340) show 83.3% identity in 162 aa
overlap:

20 10 20 30 40 50 60 

or f 8 9a. pep MMSNKMEQKGFTLIXXXXXXAIXXXXSVIXXXXYXSYIEKGYQSQLYTEMVGINNISKQX 

IIIIIIIIMIIII II Ml I Ml I i I I I I I I i I I I I I I I I I I I 

orf 89-1 MMSNKMEQKGFTLIEMMIVVAILGIISVIAIPSYQSYIEKGYQSQLYTEMVGINNISKQF 

10 20 30 40 50 60 

25 70 80 90 100 110 120 

orf 89a . pep ILKNPLDDNQTIKSKLEIFVSGYKMNPKIAEKYNVSVHFVNEEKPRAYSLVGVPKTGTGY 

II 1 1 1 1 1 II 1 1 h M I ! 1 1 1 1 1 1 M II 1 1 h I h 1 1 h I h M I III II I M h II 1 1 

orf 8 9-1 ILKNPLDDNQTIENKLEI FVSGYKMNPKIAKKYSVSVKFVDKEKSRAYRLVGVPKAGTGY 

70 80 90 100 110 120 

30 130 140 150 160 

orf 89a . pep TLSVWMNSVGDGYKCRDAASARAHLETLSSDVGCEAFSNRKKX 

i I I I ! I I I I i I I I I I I I I M : I I I I I'l I i i I I I I I I I I I I I 
or f 8 9 - 1 TLSVWMNSVGDGYKCRDAASAQAHLETLSSDVGCEAFSNRKKX 

130 140 150 160 

Homology with a predicted ORF from N. gonorrhoeae

ORF89 (SEQ ID NO: 338) shows 84.6% identity over a 162 aa overlap with a predicted ORF
(ORF89.ng) (SEQ ID NO: 344) from N. gonorrhoeae:

orf 8 9 MMSNXMXQKGFTLIXXMIWAILGIISVIAIPSYXSYIEKGYQSQLYTEMXGINNISKQF 60 

MM I MMMI 1 1 1 1 = 1 1 M 1 1 1 1 1 1 1 1 1 MMMMMMMI lllh Ml 

40 orf89ng MMSNKMEQKGFTLIEMMIWTILGIISVIAIPSYQSYIEKGYQSQLYTEMVGINNVLKQF 60 

orf 8 9 ILKNPLDDNQTIENKLEIFVSGYKMNPKIAKKYSVSVKFVDKEKSRAYRLVGVPKAGTGY 120 

Mill Ihh-IM Mllllll IMMIM :IM II IMMMIhlMM 

or f 8 9ng I LKNPQDDNDTLKSKLKI FVSGYKMNPKIAKKYSVSVRFVDAEKPRAYRLVGVPNAGTGY 12 0 



orf89 TLSVWMNSVGDGYKCRDAASAQAHLETLSSDVGCEAFSNRKK 162

orf89ng TLSVWMNSVGDGYKCRDATSAQAYSDTLSADSGCEAFSNRKK 162

The complete length ORF89ng nucleotide sequence [<SEQ ID 343>] (SEQ ID NO: 343) is:



1 aTGATGAGCA ATAAAATGGA ACAAAAAGGG TTTACATTGA TTGAGATGAT
51 GATAGTTGTC ACGATACTCG GCATCATCAG CGTCATTGCC ATACCTTCTT
101 ATCAGAGTTA TATTGAAAAA GGCTATCAGT CCCAGCTTTA TACGGAGATG
151 GTCGGTATCA ACAATGTTCT CAAACAGTTT ATTTTGAAAA ATCCCCAGGA
201 CGATAATGAT ACCCTCAAGA GCAAACTGAA AATATTTGTC TCAGGCTATA
251 AGATGAATCC GAAAAttgCC AAAAAATATA GTGTTTCGGt aaggtttGTC
301 gatGCGGAAA AACCAAGGGC ATACAGGTTG GTCGGCGTTC CGAACGCGGG
351 GACGGGTTAT ACTTTGTCGG TATGGATGAA CAGCGTGGGC GACGGATACA
401 AATGCCGTGA TGCCACTTCT GCCCAGGCCT ATTCGGACAC CTTGTCCGCA
451 GATAGCGGCT GTGAAGCTTT CTCTAATCGT AAAAAATAG

This encodes a protein having amino acid sequence (SEQ ID NO: 344):

1 MMSNKMEQKG FTLIEMMIVV TILGIISVIA IPSYQSYIEK GYQSQLYTEM
51 VGINNVLKQF ILKNPQDDND TLKSKLKIFV SGYKMNPKIA KKYSVSVRFV
101 DAEKPRAYRL VGVPNAGTGY TLSVWMNSVG DGYKCRDATS AQAYSDTLSA
151 DSGCEAFSNR KK*

This gonococcal protein has a putative leader peptide (underlined) and N-terminal methylation site
(NMePhe or type-4 pili, double-underlined). In addition, ORF89ng (SEQ ID NO: 344) and ORF89-1
(SEQ ID NO: 340) show 88.3% identity in 162 aa overlap:



25 10 20 30 40 50 60 

orf 89-1. pep MMSNKMEQKGFTLI EMMI WA I LGI I SVI AI PS YQS YIEKGYQSQLYTEMVGINNI SKQF 

I M 1 1 M 1 1 1 1 1 M 1 1 1 1 M 1 1 II 1 1 1 I 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 h III 

orf 89ng MMSNKMEQKGFTLI EMMI WT I LGI I S VI AI PSYQSYI EKGYQSQLYTEMVGINNVLKQF 

10 20 30 40 50 60 



30 70 80 90 100 110 120 

orf 89-1. pep ILKNPLDDNQTIENKLEIFVSGYKMNPKIAKKYSVSVKFVDKEKSRAYRLVGVPKAGTGY 

Mill Il|:|:::||:|llllllllllllllllllhlll II I I I I I I I h I I I I I 
orf 89ng ILKNPQDDNDTLKSKLKIFVSGYKMNPKIAKKYSVSVRFVDAEKPRAYRLVGVPNAGTGY 

70 80 90 100 110 120 



35 130 140 150 160 

orf 89-1. pep TLS VWMNS VGDGYKCRDAASAQAHLETLSSDVGCEAFSNRKKX 

I I M I I I I I I I Ml I I M II h : I I I : I IIMIIIIIII 
or f 8 9ng TLSVWMNSVGDGYKCRDATSAQAYSDTLSADSGCEAFSNRKKX 

130 140 150 160 



40 



Based on this analysis, including the gonococcal motifs and the homology with the known PilE
protein (SEQ ID NO: 1135), it was predicted that these proteins from N. meningitidis and
N. gonorrhoeae, and their epitopes, could be useful antigens for vaccines or diagnostics, or for
raising antibodies.

ORF89-1 (SEQ ID NO: 340) (13.6 kDa) was cloned in the pGex vector and expressed in E. coli, as
described above. The products of protein expression and purification were analyzed by SDS-PAGE.
Figure 11A shows the results of affinity purification of the GST-fusion protein. Purified
GST-fusion protein was used to immunise mice, whose sera gave a positive result in the ELISA
test, confirming that ORF89-1 (SEQ ID NO: 340) is a surface-exposed protein, and that it is a
useful immunogen.



The following partial DNA sequence was identified in N. meningitidis [<SEQ ID 345>] (SEQ ID
NO: 345):

1 ATGAAAAAAT CCTCCCTCAT CAGCGCATTG GGCATCGGTA TTTTGAGCAT
51 CGGCATGGCA TTTGCCGCCC CTGCCGACGC GGTAAGCCAA ATCCGTCAAA
101 ACGCCACTCA AGTATTGAGC ATCTTAAAAA ACGGCGATGC CAACACCGCT
151 CGCCAAAAAG CCGAAGCCTA TGCGATTCCC TATTTCGATT TCCAACGTAT
201 GACCGCATTG GCGGTCGGCA ACCCTTGGsG CACCG.GTCC GACG.GCAAA
251 AACAAGCGTT GGCCn.AGAA TTTCAACCC...

This corresponds to the amino acid sequence [<SEQ ID 346; ORF91>] (SEQ ID NO: 346;
ORF91):

1 MKKSSLISAL GIGILSIGMA FAAPADAVSQ IRQNATQVLS ILKNGDANTA
51 RQKAEAYAIP YFDFQRMTAL AVGNPWXTXS DXQKQALAXE FQP...

Further work revealed the complete nucleotide sequence [<SEQ ID 347>] (SEQ ID NO: 347):

1 ATGAAAAAAT CCTCCCTCAT CAGCGCATTG GGCATCGGTA TTTTGAGCAT
51 CGGCATGGCA TTTGCCGCCC CTGCCGACGC GGTAAGCCAA ATCCGTCAAA
101 ACGCCACTCA AGTATTGAGC ATCTTAAAAA ACGGCGATGC CAACACCGCT
151 CGCCAAAAAG CCGAAGCCTA TGCGATTCCC TATTTCGATT TCCAACGTAT
201 GACCGCATTG GCGGTCGGCA ACCCTTGGCG CACCGCGTCC GACGCGCAAA
251 AACAAGCGTT GGCCAAAGAA TTTCAAACCC TGCTGATCCG CACCTATTCC
301 GGCACGATGC TGAAATTAAA AAACGCCAAC GTCAACGTCA AAGACAATCC
351 CATCGTCAAT AAAGGCGGCA AAGAAATCAT CGTCCGCGCC GAAGTCGGCG
401 TACCCGGGCA AAAACCCGTC AACATGGACT TCACCACCTA CCAAAGCGGC
451 GGTAAATACC GTACCTACAA CGTCGCCATC GAAGGCGCGA GCCTGGTTAC
501 CGTGTACCGC AACCAATTCG GCGAAATTAT CAAAGCGAAA GGCGTGGACG
551 GACTGATTGC CGAGTTGAAA GCCAAAAACG GCGGCAAATA A

This corresponds to the amino acid sequence [<SEQ ID 348; ORF91-1>] (SEQ ID NO: 348;
ORF91-1):

1 MKKSSLISAL GIGILSIGMA FAAPADAVSQ IRQNATQVLS ILKNGDANTA
51 RQKAEAYAIP YFDFQRMTAL AVGNPWRTAS DAQKQALAKE FQTLLIRTYS
101 GTMLKLKNAN VNVKDNPIVN KGGKEIIVRA EVGVPGQKPV NMDFTTYQSG
151 GKYRTYNVAI EGASLVTVYR NQFGEIIKAK GVDGLIAELK AKNGGK*

Computer analysis of this amino acid sequence gave the following results:

Homology with a predicted ORF from N. meningitidis (strain A)

ORF91 (SEQ ID NO: 346) shows 92.4% identity over a 92 aa overlap with an ORF (ORF91a)
(SEQ ID NO: 350) from strain A of N. meningitidis:

10 20 30 40 50 60 

orf 91 .pep MKKS SL I S ALG I G I LS I GMAFAAPADAVSQ I RQNATQVLS I LKNGDANTARQKAEAYAI P . 

MMMMMMMMMI MMMMMMMMMMMMMMMM i MM 

10 orf91a MKKS S F I S ALG IGILSI GMAFAAPADAVNQ I RQNATQVLS I LKSGDANTARQKAEAYA I P 

10 20 30 40 50 60 

70 80 90 

orf 91 . pep YFDFQRMTALAVGNPWXTXSDXQKQALAXEFQP 

IIMIIIIIIMIIII I II Mllll Ml 

1 5 orf 9 la YFDFQRMTALAVGNPWRTASDAQKQALAKEFQTLLIRTYSGTMLKLKNANVNVKDNPIVN 

70 80 90 100 110 120 

orf 91a KGGKEI I VRAEVGVPGQKPVNMDFTTYQSGGKYRTYNVAI EGASLVTVYRNQFGE 1 1 KAK 

130 140 150 160 170 180 

The complete length ORF91a nucleotide sequence [<SEQ ID 349>] (SEQ ID NO: 349) is:

1 ATGAAAAAAT CCTCCTTCAT CAGCGCATTG GGCATCGGTA TTTTGAGCAT
51 CGGCATGGCA TTTGCCGCCC CTGCCGACGC GGTAAACCAA ATCCGTCAAA
101 ACGCCACTCA AGTATTGAGC ATCTTAAAAA GCGGTGATGC CAACACCGCC
151 CGCCAAAAAG CCGAAGCCTA TGCGATTCCC TATTTCGATT TCCAACGTAT
201 GACCGCATTG GCGGTCGGCA ACCCTTGGCG CACCGCGTCC GACGCGCAAA
251 AACAAGCGTT GGCCAAAGAA TTTCAAACCC TGCTGATCCG CACCTATTCC
301 GGCACGATGC TGAAATTAAA AAACGCCAAC GTCAACGTCA AAGACAATCC
351 CATCGTCAAT AAAGGCGGCA AAGAAATCAT CGTCCGCGCC GAAGTCGGCG
401 TACCCGGGCA AAAACCCGTC AACATGGACT TCACCACCTA CCAAAGCGGC
451 GGTAAATACC GTACCTACAA CGTCGCCATC GAAGGCGCGA GCCTGGTTAC
501 CGTGTACCGC AACCAATTCG GCGAAATTAT CAAAGCGAAA GGCGTGGACG
551 GACTGATTGC CGAGTTGAAG GCTAAAAACG GCAGCAAGTA A

This encodes a protein having amino acid sequence [<SEQ ID 350>] (SEQ ID NO: 350):

1 MKKSSFISAL GIGILSIGMA FAAPADAVNQ IRQNATQVLS ILKSGDANTA
51 RQKAEAYAIP YFDFQRMTAL AVGNPWRTAS DAQKQALAKE FQTLLIRTYS
101 GTMLKLKNAN VNVKDNPIVN KGGKEIIVRA EVGVPGQKPV NMDFTTYQSG
151 GKYRTYNVAI EGASLVTVYR NQFGEIIKAK GVDGLIAELK AKNGSK*

ORF91a (SEQ ID NO: 350) and ORF91-1 (SEQ ID NO: 348) show 98.0% identity in 196 aa
overlap:



10 20 30 40 50 60 

orf 91a. pep MKKSSFISALGIGILS I GMAFAAPADAVNQ I RQNATQVLS I LKSGDANTARQKAEAYA I P 
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Mil MMIMIIIIIIIIIMIIMMMIIIIMIIMMIIMIMI lllllll 

orf 91-1 MKKS S L I S ALG I G I LS I GMAFAAP ADAVS Q I RQNATQVLS I L KNGDANTARQ KAEAYA I P 

10 20 30 40 50 60 

70 80 90 100 110 120 

5 orf 91a. pep YFDFQRMTALAVGNPWRTASDAQKQALAKEFQTLLIRTYSGTMLKLKNANVNVKDNPI VN 

1 1 1 1 1 1 1 1 1 1 1 1 1 h 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 M 1 1 1 1 1 1 II 1 1 1 1 1 1 II I M 1 1 1 1 1 1 

or f 9 1 - 1 YFDFQRMTALAVGNPWRTASDAQKQALAKEFQTLLIRTYSGTMLKLKNANVNVKDNPI VN 

70 80 90 100 110 120 

130 140 150 160 170 180 

10 orf 91a. pep KGGKEI IVRAEVGVPGQKPVNMDFTTYQSGGKYRTYNVAIEGASLVTVYRNQFGEI IKAK 

I ! 1 1 1 1 II 1 1 1 1 1 1 1 II 1 1 II I M 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 II M 1 1 1 1 1 1 1 1 1 1 1 1 

orf 91-1 KGGKEI I VRAEVGVPGQKPVNMDFTTYQSGGKYRTYNVAIEGASLVTVYRNQFGEI IKAK 

130 140 150 160 170 180 

190 

15 orf 9 la. pep GVDGLIAELKAKNGSKX 

IIIIIIIIIMIIhll 
or f 9 1 - 1 GVDGLIAELKAKNGGKX 

190 
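The identity figure quoted for this comparison can be recomputed from the two amino acid listings. The sketch below assumes an ungapped, end-to-end comparison (both sequences are 196 residues, excluding the stop) and counts a column as identical only when the residues match exactly; the helper names `join_blocks` and `percent_identity` are hypothetical and are not taken from the alignment program used for the comparisons in this document.

```python
def join_blocks(listing: str) -> str:
    """Join the 10-residue blocks of a sequence listing into one string."""
    return "".join(listing.split())


def percent_identity(row_a: str, row_b: str) -> float:
    """Percentage of aligned columns with the same residue in both rows.

    Assumption: gap columns ('-') count toward the overlap length but never
    count as matches.
    """
    if len(row_a) != len(row_b):
        raise ValueError("aligned rows must have equal length")
    matches = sum(1 for a, b in zip(row_a, row_b) if a == b and a != "-")
    return 100.0 * matches / len(row_a)


ORF91A = join_blocks("""
MKKSSFISAL GIGILSIGMA FAAPADAVNQ IRQNATQVLS ILKSGDANTA
RQKAEAYAIP YFDFQRMTAL AVGNPWRTAS DAQKQALAKE FQTLLIRTYS
GTMLKLKNAN VNVKDNPIVN KGGKEIIVRA EVGVPGQKPV NMDFTTYQSG
GKYRTYNVAI EGASLVTVYR NQFGEIIKAK GVDGLIAELK AKNGSK
""")

ORF91_1 = join_blocks("""
MKKSSLISAL GIGILSIGMA FAAPADAVSQ IRQNATQVLS ILKNGDANTA
RQKAEAYAIP YFDFQRMTAL AVGNPWRTAS DAQKQALAKE FQTLLIRTYS
GTMLKLKNAN VNVKDNPIVN KGGKEIIVRA EVGVPGQKPV NMDFTTYQSG
GKYRTYNVAI EGASLVTVYR NQFGEIIKAK GVDGLIAELK AKNGGK
""")

# 1-based positions where the strain A and ORF91-1 sequences differ.
mismatches = [
    (pos, a, b)
    for pos, (a, b) in enumerate(zip(ORF91A, ORF91_1), start=1)
    if a != b
]

print(len(ORF91A), "aa overlap;", len(mismatches), "differences:", mismatches)
print(f"{percent_identity(ORF91A, ORF91_1):.1f}% identity")
```

Under these assumptions the two sequences differ at four positions (6, 29, 44 and 195), that is 192 identical residues out of 196, in agreement with the 98.0% figure stated above.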

Homology with a predicted ORF from N. gonorrhoeae

ORF91 (SEQ ID NO: 346) shows 84.8% identity over a 92 aa overlap with a predicted ORF
(ORF91.ng) (SEQ ID NO: 352) from N. gonorrhoeae:

orf 91 . pep MKKSSL I SALG I GI LS IGMAFAAPADAVSQ I RQNATQVLS I LKNGDANTARQ KAEAYA I P 60 

: I I I I : I I I I I I I I I I I I I I hi I . I h I I I I I I I I I H I I : I I I :|| IIIMhl 
orf 91ng VKKSS F I SALG I GI LS IGMAFAS PADAVGQ I RQNATQVLT I LKSGDAASARPKAEAYAVP 60 

25 orf 91. pep YFDFQRMTALAVGNPWXTXSDXQKQALAXEFQP 93 

I i I I I I I I 1 I I ' I I | || IIMM III 
or f 9 lng YFDFQRMTALAVGNPWRTASDAQKQALAKEFQTLLIRTYSGTMLKFKNATVNVKDNPIVN 120 

The complete length ORF91ng nucleotide sequence [<SEQ ID 351>] (SEQ ID NO: 351) is
predicted to encode a protein having amino acid sequence [<SEQ ID 352>] (SEQ ID NO: 352):

1 VKKSSFISAL GIGILSIGMA FASPADAVGQ IRQNATQVLT ILKSGDAASA
51 RPKAEAYAVP YFDFQRMTAL AVGNPWRTAS DAQKQALAKE FQTLLIRTYS
101 GTMLKFKNAT VNVKDNPIVN KGGKEIVVRA EVGIPGQKPV NMDFTTYQSG
151 GKYRTYNVAI EGTSLVTVYR NQFGEIIKAK GIDGLIAELK AKNGGK*



Further work revealed the complete nucleotide sequence [<SEQ ID 353>] (SEQ ID NO: 353):

1 ATGAAAAAAT CCTCCTTCAT CAGCGCATTG GGCATCGGTA TTTTGAGCAT
51 CGGCATGGCA TTTGCCTCCC CGGCCGACGC AGTGGGACAA ATCCGCCAAA
101 ACGCCACACA GGTTTTGACC ATCCTCAAAA GCGGCGACGC GGCTTCTGCA
151 CGCCCAAAAG CCGAAGCCTA TGCGGTTCCC TATTTCGATT TCCAACGTAT
201 GACCGCATTG GCGGTCGGCA ACCCTTGGCG TACCGCGTCC GACGCGCAAA
251 AACAAGCGTT GGCCAAAGAA TTTCAAACCC TGCTGATCCG CACCTATTCC
301 GGCACGATGC TGAAATTCAA AAACGCGACC GTCAACGTCA AAGACAATCC
351 CATCGTCAAT AAGGGCGGCA AGGAAATCGT CGTCCGTGCC GAAGTCGGCA
401 TCCCCGGTCA GAAGCCCGTC AATATGGACT TTACCACCTA CCAAAGCGGC
451 GGCAAATACC GTACCTACAA CGTCGCCATC GAAGGCACGA GCCTGGTTAC
501 CGTGTACCGC AACCAATTCG GCGAAATCAT CAAAGCCAAA GGCATCGACG
551 GGCTGATTGC CGAGTTGAAA GCCAAAAACG GCGGCAAATA A

This corresponds to the amino acid sequence [<SEQ ID 354; ORF91ng-1>] (SEQ ID NO: 354;
ORF91ng-1):

1 MKKSSFISAL GIGILSIGMA FASPADAVGQ IRQNATQVLT ILKSGDAASA
51 RPKAEAYAVP YFDFQRMTAL AVGNPWRTAS DAQKQALAKE FQTLLIRTYS
101 GTMLKFKNAT VNVKDNPIVN KGGKEIVVRA EVGIPGQKPV NMDFTTYQSG
151 GKYRTYNVAI EGTSLVTVYR NQFGEIIKAK GIDGLIAELK AKNGGK*

ORF91ng-1 (SEQ ID NO: 354) and ORF91-1 (SEQ ID NO: 348) show 92.3% identity in 196 aa
overlap:



10 20 30 40 50 60

orf91-1.pep MKKSSLISALGIGILSIGMAFAAPADAVSQIRQNATQVLSILKNGDANTARQKAEAYAIP

Mllhlllllllllllllllhllllhlllllllllhllhlll HI lllllhl
orf91ng-1 MKKSSFISALGIGILSIGMAFASPADAVGQIRQNATQVLTILKSGDAASARPKAEAYAVP
10 20 30 40 50 60

70 80 90 100 110 120

orf91-1.pep YFDFQRMTALAVGNPWRTASDAQKQALAKEFQTLLIRTYSGTMLKLKNANVNVKDNPIVN

IIIMIIIMIIIIIIIMI IIIMIIIIMMIIIMMIIIIMIMMIMMM

orf91ng-1 YFDFQRMTALAVGNPWRTASDAQKQALAKEFQTLLIRTYSGTMLKFKNATVNVKDNPIVN
70 80 90 100 110 120

130 140 150 160 170 180

orf91-1.pep KGGKEIIVRAEVGVPGQKPVNMDFTTYQSGGKYRTYNVAIEGASLVTVYRNQFGEIIKAK
I I I II h I I I I I h I I I I I I I I I I II I I I I I Ml I I M I I :| I I I I I I i ■ I I I I I I I I
orf91ng-1 KGGKEIVVRAEVGIPGQKPVNMDFTTYQSGGKYRTYNVAIEGTSLVTVYRNQFGEIIKAK
130 140 150 160 170 180

190

orf91-1.pep GVDGLIAELKAKNGGKX

hlllllllllllllll
orf91ng-1 GIDGLIAELKAKNGGKX

190

In addition, ORF91ng-1 (SEQ ID NO: 354) shows homology to a hypothetical E. coli protein (SEQ
ID NO: 1136):

sp|P45390|YRBC_ECOLI HYPOTHETICAL 24.0 KD PROTEIN IN MURA-RPON INTERGENIC REGION
PRECURSOR (F211) >gi|606130 (U18997) ORF_f211 [Escherichia coli] >gi|1789583
(AE000399) hypothetical 24.0 kD protein in murZ-rpoN intergenic region [Escherichia
coli] Length = 211

Score = 70.6 bits (170), Expect = 6e-12 

Identities = 42/137 (30%), Positives = 76/137 (54%), Gaps = 6/137 (4%) 

Query: 59 VPYFDFQRMTALAVGNPWRTASDAQKQALAKEFQTLLIRTYSGTMLKFKNATVNVKDNPI 118 

+ PY + AL +G + + +A+ AQ++A F+L + Y + + T + P 
Sbjct: 65 LPYVQVKYAGALVLGQYYKSATPAQREAYFAAFREYLKQAYGQALAMYHGQTYQIA--PE 122

Query: 119 VNKGGKEIV-VRAEVGIP-GQKPVNMDFTTYQSG--GKYRTYNVAIEGTSLVTVYRNQFG 174

G K IV +R + P G+ PV +DF ++ G ++ Y++ EG S++T +N++G 
Sbjct: 123 QPLGDKTIVPIRVTIIDPNGRPPVRLDFQWRKNSQTGNWQAYDMIAEGVSMITTKQNEWG 182 

Query: 175 EIIKAKGIDGLIAELKA 191 

+++ KGIDGL A+LK+ 
Sbjct: 183 TLLRTKG I DGLTAQLKS 199 

Based on this analysis, including the presence of a putative leader sequence in the gonococcal 
protein, it is predicted that the proteins from N. meningitidis and N. gonorrhoeae, and their epitopes, 
could be useful antigens for vaccines or diagnostics, or for raising antibodies. 



Example 42 



The following DNA sequence was identified in N. meningitidis [<SEQ ID 355>] (SEQ ID NO:
355):



1 ATGAAACACA TACTCCCCCT GATTGCCGCA TCCGCACTCT GCATTTCAAC 

51 CGCTTCGGCA CATCCTGCCA GCGAACCGTC CACTCAAAAC GAAACCGCTA 

101 TGATCACGCA TACCCTCATC TCAAAATACA GTTTTGGnnn nnnnnnnnnn

151 nnnnnnnnnn nnGCCATAAA AAGCAAAGGG ATGGACATTT TTGCCGTCAT 

201 CGACCATCAG GAAGCCGCAC GCCGAAACGG CTTAACGATG CAGCCGGCAA 

251 AAGTCATCGT CTTCGGCACG CCCAAAGCCG GCACGCCGCT GATGGTCAAA 

301 GACCCCGCCT TCGCCCTGCA ACTGCCCCTA CGCGTCCTCG TTACCGAAAC 

351 GGACGGCAAA GTACGCGCCG CCTATACCGA TACGCGCGCC CTCATCGCCG 

401 GCAGCCGCAT CGGTTTCGAC GAAGTGGCAA ACACTTTGGC AAACGCCGAA 
451 AAACTGATAC AAAAAACCGT AGGCGAATAA 
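
The numbered listing above can be turned back into a single contiguous sequence with a small helper. The following is a minimal sketch, not part of the original disclosure: the function name `parse_listing` is hypothetical, and it assumes each input line holds only a position number followed by blocks of sequence characters (no ORF labels or prose on the same line).

```python
import re

def parse_listing(lines):
    """Join a numbered sequence listing into one contiguous string.

    Assumption: every line looks like '351 GGACGGCAAA GTACGCGCCG ...'.
    Position numbers and whitespace are dropped, so a number that was
    split during text extraction ('3 51') does no harm. Letter case is
    preserved, because the listings use lower case for some bases.
    """
    parts = []
    for line in lines:
        # Keep letters, '.' (used in the listings for an unread base)
        # and '*' (used for the stop codon in protein listings).
        parts.append("".join(re.findall(r"[A-Za-z.*]", line)))
    return "".join(parts)

rows = [
    "1 ATGAAACACA TACTCCCCCT GATTGCCGCA TCCGCACTCT GCATTTCAAC",
    "4 51 AAACTGATAC AAAAAACCGT AGGCGAATAA",
]
fragment = parse_listing(rows)
print(len(fragment), fragment[-3:])
```

The two rows used here are only the first and last rows of the listing, so `fragment` is not the full sequence; passing all ten rows would give the complete 480-base open reading frame.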

This corresponds to the amino acid sequence [<SEQ ID 356; ORF97>] (SEQ ID NO: 356; 
ORF97): 



1 MKHILPLIAA SALCISTASA HPASEPSTQN ETAMITHTLI SKYSFGXXXX 
51 XXXXAIKSKG MDIFAVIDHQ EAARRNGLTM QPAKVIVFGT PKAGTPLMVK 
101 DPAFALQLPL RVLVTETDGK VRAAYTDTRA LIAGSRIGFD EVANTLANAE 
151 KLIQKTVGE* 

Further work revealed the complete nucleotide sequence [<SEQ ID 357>] (SEQ ID NO: 357): 



1 ATGAAACACA TACTCCCCCT GATTGCCGCA TCCGCACTCT GCATTTCAAC 

51 CGCTTCGGCA CATCCTGCCA GCGAACCGTC CACCCAAAAC GAAACCGCTA 

101 TGACCACGCA TACCCTCACC TCAAAATACA GTTTTGACGA AACCGTCAGC 

151 CGCCTTGAAA CCGCCATAAA AAGCAAAGGG ATGGACATTT TTGCCGTCAT 

201 CGACCATCAG GAAGCCGCCC GCCGAAACGG CTTAACGATG CAGCCGGCAA 

251 AAGTCATCGT CTTCGGCACG CCCAAAGCCG GCACGCCGCT GATGGTCAAA 
301 GACCCCGCCT TCGCCCTGCA ACTGCCCCTA CGCGTCCTCG TTACCGAAAC 

351 GGACGGCAAA GTACGCGCCG CCTATACCGA TACGCGCGCC CTCATCGCCG 

401 GCAGCCGCAT CGGTTTCGAC GAAGTGGCAA ACACTTTGGC AAACGCCGAA 
451 AAACTGATAC AAAAAACCGT AGGCGAATAA 

This corresponds to the amino acid sequence [<SEQ ID 358; ORF97-1>] (SEQ ID NO: 358; 
ORF97-1): 

1 MKHILPLIAA SALCISTASA HPASEPSTQN ETAMTTHTLT SKYSFDETVS 
51 RLETAIKSKG MDIFAVIDHQ EAARRNGLTM QPAKVIVFGT PKAGTPLMVK 
101 DPAFALQLPL RVLVTETDGK VRAAYTDTRA LIAGSRIGFD EVANTLANAE 
151 KLIQKTVGE* 
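
The correspondence between a nucleotide listing and its amino acid sequence can be checked by translating the open reading frame. The sketch below assumes the standard genetic code (the text does not name a translation table) and renders any codon containing an ambiguous base as `X`, which mirrors how the partial sequences in these examples are written; `translate` is a hypothetical helper name.

```python
BASES = "TCAG"
# Standard genetic code, codons ordered TTT, TTC, TTA, TTG, TCT, ...
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}

def translate(dna):
    """Translate a DNA string in frame 1; incomplete trailing bases are ignored."""
    dna = "".join(dna.split()).upper()
    protein = []
    for pos in range(0, len(dna) - 2, 3):
        codon = dna[pos:pos + 3]
        # Codons with N, '.', or any other non-ACGT symbol become 'X'.
        protein.append(CODON_TABLE.get(codon, "X"))
    return "".join(protein)

# First row of the complete sequence (SEQ ID NO: 357).
first_row = "ATGAAACACA TACTCCCCCT GATTGCCGCA TCCGCACTCT GCATTTCAAC"
print(translate(first_row))  # MKHILPLIAASALCIS
```

The printed residues match the start of the ORF97-1 listing (MKHILPLIAA SALCIS...). Feeding the whole listing through the same function is a quick way to spot extraction errors in either the DNA or the protein text.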

Computer analysis of this amino acid sequence gave the following results: 
Homology with a predicted ORF from N. meningitidis (strain A) 

ORF97 (SEQ ID NO: 356) shows 88.7% identity over a 159aa overlap with an ORF (ORF97a) 
(SEQ ID NO: 360) from strain A of N. meningitidis: 

10 20 30 40 50 60 

orf 97. pep MKHILPLIAASALCISTASAHPASEPSTQNETAMITHTLISKYSFGXXXXXXXXAIKSKG 

I Mill llllllllll lllllhlllllll I I M Mill : =IIMM 
orf97a MXHILPLXXASALCISTASXHPASEPQTQNETAMTTHTLTSKYSFDETVSRLETAIKSKG 

10 20 30 40 50 60 

70 80 90 100 110 120 

orf 97 .pep MDIFAVIDHQEAARRNGLTMQPAKVIVFGTPKAGTPLMVKDPAFALQLPLRVLVTETDGK 

I 1 1 1 1 II II II II 1 1 1 1 1! 1 1 1 M 1 1 1 II II II 1 1 1 1 II II M II li 1 1 lllllll 

orf 97a MDIFAVIDHQEAARRNGLTMQPAKVIVFGTPKAGTPLMVKDPAFALQLPLRVXVTETDGK 

70 80 90 100 110 120 

130 140 150 160 

orf97.pep VRAAYTDTRALIAGSRIGFDEVANTLANAEKLIQKTVGEX 
I I I I I I I I I I I I I I I I I I ! I I I I I I I I I I I I I I I I h II I 
orf97a VRAAYTDTRALIAGSRIGFDEVANTLANAEKLIQKTIGEX 

130 140 150 160 

The complete length ORF97a nucleotide sequence [<SEQ ID 359>] (SEQ ID NO: 359) is: 

1 ATGANACACA TACTCCCCCT GANTGNCGCA TCCGCACTCT GCATTTCAAC 

51 CGCTTCGGNN CATCCTGCCA GCGAACCGCA AACCCAAAAC GAAACCGCTA 

101 TGACCACGCA TACCCTCACC TCAAAATACA GTTTTGACGA AACCGTCAGC 

151 CGCCTTGAAA CCGCCATAAA AAGCAAAGGG ATGGACATTT TTGCCGTCAT 

201 CGACCATCAG GAAGCCGCCC GCCGAAACGG CTTAACGATG CAGCCGGCAA 

251 AAGTCATCGT CTTCGGCACG CCCAAAGCCG GTACGCCGCT GATGGTCAAA 

301 GACCCCGCCT TCGCCCTGCA ACTGCCCCTG CGCGTCNTCG TTACCGAAAC 
351 GGACGGCAAA GTACGCGCCG CCTATACCGA TACGCGCGCC CTCATCGCCG 

401 GCAGCCGCAT CGGTTTCGAC GAAGTGGCAA ACACTTTGGC AAACGCCGAA 
451 AAACTGATAC AAAAAACCAT AGGCGAATAA 

This encodes a protein having amino acid sequence [<SEQ ID 360>] (SEQ ID NO: 360): 



1 MXHILPLXXA SALCISTASX HPASEPQTQN ETAMTTHTLT SKYSFDETVS 

51 RLETAIKSKG MDIFAVIDHQ EAARRNGLTM QPAKVIVFGT PKAGTPLMVK 

101 DPAFALQLPL RVXVTETDGK VRAAYTDTRA LIAGSRIGFD EVANTLANAE 

151 KLIQKTIGE* 

ORF97a (SEQ ID NO: 360) and ORF97-1 (SEQ ID NO: 358) show 95.6% identity in 159 aa 
overlap: 

10 20 30 40 50 60 

orf97a.pep MXHILPLXXASALCISTASXHPASEPQTQNETAMTTHTLTSKYSFDETVSRLETAIKSKG 

I Mill MINIMI I Ml MM 1 1 1 1 1 M I II M M 1 1 1 1 1 1 II 1 1 1 1 II M I 

orf97-1 MKHILPLIAASALCISTASAHPASEPSTQNETAMTTHTLTSKYSFDETVSRLETAIKSKG 

10 20 30 40 50 60 

70 80 90 100 110 120 

orf97a.pep MDIFAVIDHQEAARRNGLTMQPAKVIVFGTPKAGTPLMVKDPAFALQLPLRVXVTETDGK 

1 1 1 1 1 1 1 M M I II M M 1 1 1 1 1 M M I II 1 1 II 1 1 1 1 1 1 M I M 1 1 1 1 lllllll 

orf97-1 MDIFAVIDHQEAARRNGLTMQPAKVIVFGTPKAGTPLMVKDPAFALQLPLRVLVTETDGK 

70 80 90 100 110 120 

130 140 150 160 

orf97a.pep VRAAYTDTRALIAGSRIGFDEVANTLANAEKLIQKTIGEX 

1 1 1 1 1 1 1 1 1 1 1 1 1 M 1 1 1 II 1 1 1 1 M 1 1 1 1 1 1 1 MM 

orf 97-1 VRAAYTDTRALIAGSRIGFDEVANTLANAEKLIQKTVGEX 

130 140 150 160 

Homology with a predicted ORF from N. gonorrhoeae 

ORF97 (SEQ ID NO: 356) shows 88.1% identity over a 159aa overlap with a predicted ORF 
(ORF97.ng) (SEQ ID NO: 362) from N. gonorrhoeae: 

orf 97 .pep MKHILPLIAASALCISTASAHPASEPSTQNETAMITHTLISKYSFGXXXXXXXXAIKSKG 60 

MM IIIIMMMIMIMM III 1 1 1 1 Mill : MUM 

orf97ng MKHILPPIAASAFCISTASAHPAGKPPTQNETAMTTHTLTSKYSFDETVSRLETAIKSKG 60 

orf97.pep MDIFAVIDHQEAARRNGLTMQPAKVIVFGTPKAGTPLMVKDPAFALQLPLRVLVTETDGK 120 

M I III 1 1 1 1 1 II 1 1 1 1 1 M I M I II I M M 1 1 M I II 1 1 1 1 1 II 1 1 1 1 1 1 1 II I II II 

orf 97ng MDIFAVIDHQEAARRNGLTMQPAKVIVFGTPKAGTPLMVKDPAFALQLPLRVLVTETDGK 120 

orf 97 .pep VRAAYTDTRALIAGSRIGFDEVANTLANAEKLIQKTVGE 159 

IMIIMMIIMIMMMMIIII I I MIMM 

orf97ng VRTAYTDTRALIVGSRISFDEVANTLANAEKLIQKTVGE 159 

The complete length ORF97ng nucleotide sequence [<SEQ ID 361>] (SEQ ID NO: 361) is 
predicted to encode a protein having amino acid sequence [<SEQ ID 362>] (SEQ ID NO: 362): 



1 MKHILPPIAA SAFCISTASA HPAGKPPTQN ETAMTTHTLT SKYSFDETVS 
51 RLETAIKSKG MDIFAVIDHQ EAARRNGLTM QPAKVIVFGT PKAGTPLMVK 

101 DPAFALQLPL RVLVTETDGK VRTAYTDTRA LIVGSRISFD EVANTLANAE 
151 KLIQKTVGE* 

Further work revealed the complete nucleotide sequence [<SEQ ID 363>] (SEQ ID NO: 363): 

1 ATGAAACACA TACTCCCcct gatcgccgca TccgcactCT GCATTTCAAC 
51 CGCTTCGGCA CACCCTGCCG GCAAACCGCC CACCCAAAAC GAAACCGCTA 
101 TGACCACGCA CACCCTCACC TCGAAATACA GTTTTGACGA AACCGTCAGC 

151 CGCCTTGAAA CCGCCATAAA AAGCAAAGGG ATGGACATTT TTGCCGTCAT 

201 CGACCATCAG GAAGCGGCAC GCCGAAACGG CCTGACCATG CAGCCGGCAA 

251 AAGTCATCGT CTTCGGCACG CCCAAGGCCG GTACGCCgct GATGGTCAAA 
301 GACCCCGCCT TCGCCCTGCA ACTGCCCCTG CGCGTCCTCG TTACCGAAAC 

351 GGACGGCAAA GTACGCACCG CCTATACCGA TACGCGCGCC CTCATCGTCG 

401 GCAGCCGCAT CAGTTTCGAC GAAGTGGCAA ACACTTTGGC AAACGCCGAA 
451 AAACTGATAC AAAAAACCGT AGGCGAATAA 

This corresponds to the amino acid sequence [<SEQ ID 364; ORF97ng-1>] (SEQ ID NO: 364; 
ORF97ng-1): 

1 MKHILPLIAA SALCISTASA HPAGKPPTQN ETAMTTHTLT SKYSFDETVS 
51 RLETAIKSKG MDIFAVIDHQ EAARRNGLTM QPAKVIVFGT PKAGTPLMVK 
101 DPAFALQLPL RVLVTETDGK VRTAYTDTRA LIVGSRISFD EVANTLANAE 
151 KLIQKTVGE* 

ORF97ng-1 (SEQ ID NO: 364) and ORF97-1 (SEQ ID NO: 358) show 96.2% identity in 159 aa 
overlap: 

10 20 30 40 50 60 

orf97-1.pep MKHILPLIAASALCISTASAHPASEPSTQNETAMTTHTLTSKYSFDETVSRLETAIKSKG 

1 1 1 1 M 1 1 1 1 1 1 1 i 1 1 . M 1 1 1 - IMIIIIIIIII I IMIIIIIIIIIIIIII 

orf97ng-1 MKHILPLIAASALCISTASAHPAGKPPTQNETAMTTHTLTSKYSFDETVSRLETAIKSKG 

10 20 30 40 50 60 

70 80 90 100 110 120 

orf97-1.pep MDIFAVIDHQEAARRNGLTMQPAKVIVFGTPKAGTPLMVKDPAFALQLPLRVLVTETDGK 

1 1 1 1 1 M I Ml 1 1 1 1 1 M M 1 1 1 1 1 II 1 1 1 II 1 1 1 1 1 i I I 1 1 1 II 1 1 1 1 1 1 1 1 II 1 1 

orf 97ng-l MDIFAVIDHQEAARRNGLTMQPAKVIVFGTPKAGTPLMVKDPAFALQLPLRVLVTETDGK 

70 80 90 100 110 120 

130 140 150 160 

orf97-1.pep VRAAYTDTRALIAGSRIGFDEVANTLANAEKLIQKTVGEX 

I H I I I I I I I M I M I I I I I I I I I I I I I I I I I I I 
orf97ng-1 VRTAYTDTRALIVGSRISFDEVANTLANAEKLIQKTVGEX 

130 140 150 160 
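
The identity figure quoted for this alignment can be reproduced directly from the two listed sequences. This is a minimal sketch under stated assumptions: the sequences are compared position by position without gaps (which holds here because both are 159 residues long), the terminal stop is excluded, and `percent_identity` is a hypothetical helper, not the program used to generate the original alignment.

```python
def percent_identity(a, b):
    """Percentage of positions carrying the same residue in two pre-aligned sequences."""
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    # An undetermined residue ('X') is never counted as a match.
    matches = sum(1 for x, y in zip(a, b) if x == y and x != "X")
    return 100.0 * matches / len(a)

ORF97_1 = (
    "MKHILPLIAA SALCISTASA HPASEPSTQN ETAMTTHTLT SKYSFDETVS "
    "RLETAIKSKG MDIFAVIDHQ EAARRNGLTM QPAKVIVFGT PKAGTPLMVK "
    "DPAFALQLPL RVLVTETDGK VRAAYTDTRA LIAGSRIGFD EVANTLANAE "
    "KLIQKTVGE"
).replace(" ", "")

ORF97NG_1 = (
    "MKHILPLIAA SALCISTASA HPAGKPPTQN ETAMTTHTLT SKYSFDETVS "
    "RLETAIKSKG MDIFAVIDHQ EAARRNGLTM QPAKVIVFGT PKAGTPLMVK "
    "DPAFALQLPL RVLVTETDGK VRTAYTDTRA LIVGSRISFD EVANTLANAE "
    "KLIQKTVGE"
).replace(" ", "")

print(round(percent_identity(ORF97_1, ORF97NG_1), 1))  # 96.2
```

Six of the 159 positions differ (three in the HPASEPSTQN block, one in VRAAYTDTRA, two in LIAGSRIGFD), giving 153/159, which rounds to the 96.2% stated in the text.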

Based on this analysis, including the presence of a putative leader sequence in the gonococcal 
protein, it was predicted that the proteins from N. meningitidis and N. gonorrhoeae, and their 
epitopes, could be useful antigens for vaccines or diagnostics, or for raising antibodies. 

ORF97-1 (SEQ ID NO: 358) (15.3 kDa) was cloned in pET and pGex vectors and expressed in 
E. coli, as described above. The products of protein expression and purification were analyzed by 
SDS-PAGE. Figures 12A & 12B show, respectively, the results of affinity purification of the GST- 
fusion and His-fusion proteins. Purified GST-fusion protein was used to immunise mice, whose 
sera were used for Western Blot (Figure 12C), ELISA (positive result), and FACS analysis (Figure 
12D). These experiments confirm that ORF97-1 (SEQ ID NO: 358) is a surface-exposed protein, 
and that it is a useful immunogen. 

Figure 12E shows plots of hydrophilicity, antigenic index, and AMPHI regions for ORF97-1 (SEQ 
ID NO: 358). 

Example 43 

The following DNA sequence, believed to be complete, was identified in N. meningitidis [<SEQ ID 
365>] (SEQ ID NO: 365): 

1 ATGGCTTTTA TTACGCGCTT ATTCAAAAGC AGTAAATGGC TGATTGTGCC 

51 GCTGATGCTC CCCGCCTTTC AGAATGTGGC GGCGGAGGGG ATAGATGTGA 

101 GCCGTGCCGA AGCGAGGATA ACCGACGGCG GGCAGCTTTC CATCAGCAGC 

151 CGCTTCCAAA CCGAGCTGCC CGACCAGCTC CAACAGGCGT TGCGCCGGGg 

201 CGTGCCGCTC AACTTTACCT TAAGCTGGCA GCTTTCCGCC CCGATAATCG 

251 CTTCTTATCG GTTTAAATTG GGGCAACTGA TTGGCGATGA CGACaATATT 

301 GACTACAAAC TGAGTTTCCA TCCGCTGACc AaACGCTACC GCGTTACCgT 

351 CGgCGCGTTT TCGACAGACT ACGACACCTT GGATGCGGCA TTGCGCGCGA 

401 CCGGCGCGGT TGCCAACTGG AAAGTCCTGA ACAAAGGCGC GCTGTCCGGT 

451 GCGGAAGCAG GGGAAACCAA GGCGGAAATC CGCCTGACGC TGTCCACTTC 

501 AAAACTGCCC AAGCCTTTTC AAATCAATGC ATTGACTTCT CAAAACTGGC 

551 ATTTGGATTC GGGTTGGAAA CCTCTAAACA TCATCGGGAA CAAATAA 

This corresponds to the amino acid sequence [<SEQ ID 366; ORF106>] (SEQ ID NO: 366; 
ORF106): 

1 MAFITRLFKS SKWLIVPLML PAFQNVAAEG IDVSRAEARI TDGGQLSISS 
51 RFQTELPDQL QQALRRGVPL NFTLSWQLSA PIIASYRFKL GQLIGDDDNI 
101 DYKLSFHPLT KRYRVTVGAF STDYDTLDAA LRATGAVANW KVLNKGALSG 
151 AEAGETKAEI RLTLSTSKLP KPFQINALTS QNWHLDSGWK PLNIIGNK* 

Further work revealed the following DNA sequence [<SEQ ID 367>] (SEQ ID NO: 367): 



1 ATGGCTTTTA TTACGCGCTT ATTCAAAAGC AGTAAATGGC TGATTGTGCC 

51 GCTGATGCTC CCCGCCTTTC AGAATGTGGC GGCGGAGGGG ATAGATGTGA 

101 GCCGTGCCGA AGCGAGGATA ACCGACGGCG GGCAGCTTTC CATCAGCAGC 

151 CGCTTCCAAA CCGAGCTGCC CGACCAGCTC CAACAGGCGT TGCGCCGGGG 

201 CGTGCCGCTC AACTTTACCT TAAGCTGGCA GCTTTCCGCC CCGATAATCG 
251 CTTCTTATCG GTTTAAATTG GGGCAACTGA TTGGCGATGA CGACAATATT 

301 GACTACAAAC TGAGTTTCCA TCCGCTGACC AACCGCTACC GCGTTACCGT 
351 CGGCGCGTTT TCGACAGACT ACGACACCTT GGATGCGGCA TTGCGCGCGA 

401 CCGGCGCGGT TGCCAACTGG AAAGTCCTGA ACAAAGGCGC GCTGTCCGGT 
451 GCGGAAGCAG GGGAAACCAA GGCGGAAATC CGCCTGACGC TGTCCACTTC 
501 AAAACTGCCC AAGCCTTTTC AAATCAATGC ATTGACTTCT CAAAACTGGC 
551 ATTTGGATTC GGGTTGGAAA CCTCTAAACA TCATCGGGAA CAAATAA 

This corresponds to the amino acid sequence [<SEQ ID 368; ORF106-1>] (SEQ ID NO: 368; 
ORF106-1): 

1 MAFITRLFKS SKWLIVPLML PAFQNVAAEG IDVSRAEARI TDGGQLSISS 
51 RFQTELPDQL QQALRRGVPL NFTLSWQLSA PIIASYRFKL GQLIGDDDNI 
101 DYKLSFHPLT NRYRVTVGAF STDYDTLDAA LRATGAVANW KVLNKGALSG 

151 AEAGETKAEI RLTLSTSKLP KPFQINALTS QNWHLDSGWK PLNIIGNK* 
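
The ORF106 and ORF106-1 listings differ only in the row beginning at residue 101. A short comparison makes the position of the change explicit. This is an illustrative sketch; `differences` is a hypothetical helper and it assumes the two rows are already in register, as they are in the listings.

```python
def differences(a, b, start=1):
    """List (position, residue_in_a, residue_in_b) wherever two aligned sequences differ."""
    return [(start + i, x, y) for i, (x, y) in enumerate(zip(a, b)) if x != y]

# Row 101-150 of ORF106 (SEQ ID NO: 366) and of ORF106-1 (SEQ ID NO: 368).
orf106_row = "DYKLSFHPLT KRYRVTVGAF STDYDTLDAA LRATGAVANW KVLNKGALSG".replace(" ", "")
orf106_1_row = "DYKLSFHPLT NRYRVTVGAF STDYDTLDAA LRATGAVANW KVLNKGALSG".replace(" ", "")

print(differences(orf106_row, orf106_1_row, start=101))  # [(111, 'K', 'N')]
```

The single difference is a lysine in ORF106 replaced by an asparagine in ORF106-1 at residue 111.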

Computer analysis of this amino acid sequence gave the following results: 

Homology with a predicted ORF from N. meningitidis (strain A) 

ORF106 (SEQ ID NO: 366) shows 87.4% identity over a 199aa overlap with an ORF (ORF106a) 
(SEQ ID NO: 370) from strain A of N. meningitidis: 

10 20 30 40 50 59 

orf106.pep MAFITRLFKSSK-WLIVPLMLPAFQNVAAEGIDVSRAEARITDGGQLSISSRFQTELPDQ 

llllllllll I Ih: II - I I I I I I I I I I I I I I : I I I I I I llllllllll 

orf106a MAFITRLFKSIKQWLVLLPMLSVLPDAAAEGIDVSRAEARIXDGGQLSXXSRFQTELPDQ 

10 20 30 40 50 60 

60 70 80 90 100 110 119 

orf 106 .pep LQQALRRGVPLNFTLSWQLSAPIIASYRFKLGQLIGDDDNIDYKLSFHPLTKRYRVTVGA 

II I III II II lllllllllllll lllllllll llllllllllhllllllll 
orf106a LQXAXXRGVXLNXTLXWQLSAPIIASYRFXLGQLIGDDDXIDYKLSFHPLTNRYRVTVGA 

70 80 90 100 110 120 

120 130 140 150 160 170 179 

orf 106 . pep FSTDYDTLDAALRATGAVANWKVLNKGALSGAEAGETKAEIRLTLSTSKLPKPFQINALT 

I I I I i I i 1 1 1 i 1 1 1 M i ' I M 1 1 1 1 1 1 1 1 1 1 1 1 1 1 M 1 1 1 MM 1 1 1 1 1 1 1 M 1 1 1 1 1 

orf106a FSTXYDTLDAALRATGAVANWKVLNKGALSGAEAGETKAEIRLTLSTSKLPKPFQINALT 

130 140 150 160 170 180 

180 190 199 

orf106.pep SQNWHLDSGWKPLNIIGNKX 

IIIIIIIIIIIIIIIMIII 
orf106a SQNWHLDSGWKPLNIIGNKX 

190 200 

Due to the K->N substitution at residue 111, the homology between ORF106a (SEQ ID NO: 370) 
and ORF106-1 (SEQ ID NO: 368) is 87.9% over the same 199 aa overlap. 
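
The step from 87.4% to 87.9% follows from one additional matching position. The sketch below reconstructs that arithmetic under two assumptions that the text does not state explicitly: the reported percentage is the number of identical residues divided by the overlap length, and the K->N change at residue 111 converts exactly one mismatch into a match. The function names are hypothetical.

```python
def identical_positions(percent, overlap):
    """Recover the count of identical residues from a reported percentage."""
    return round(percent / 100 * overlap)

def percent(identical, overlap):
    """Identity percentage rounded to one decimal place, as reported in the text."""
    return round(100 * identical / overlap, 1)

overlap = 199

# ORF106 vs ORF106a: 87.4% reported over 199 aa.
n_strain_a = identical_positions(87.4, overlap)
print(n_strain_a, percent(n_strain_a + 1, overlap))  # 174 87.9

# ORF106 vs ORF106ng: 90.5% reported over 199 aa (comparison given further on).
n_gono = identical_positions(90.5, overlap)
print(n_gono, percent(n_gono + 1, overlap))  # 180 91.0
```

Both results agree with the figures given for ORF106-1, which supports reading residue 111 as the only position affected by the revised sequence.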



The complete length ORF106a nucleotide sequence [<SEQ ID 369>] (SEQ ID NO: 369) is: 

1 ATGGCTTTTA TTACGCGCTT ATTCAAAAGC ATTAAACAAT GGCTTGTGCT 

51 GCTGCCGATG CTTTCCGTTT TGCCGGACGC GGCGGCGGAG GGGATAGATG 

101 TGAGCCGCGC CGAAGCGAGG ATAANCGACG GCGGGCAGCT TTCCATNAGN 

151 AGCCGCTTCC AAACCGAGCT GCCCGACCAG CTCCAANNNG CGNNGNGCCG 

201 GGGCGTGNCG CTCAACTNTA CCTTAAGNTG GCAGCTTTCC GCCCCGATAA 

251 TCGCTTCTTA TCGGTTTNAA TTGGGGCAAC TGATTGGCGA TGACGACNAT 
301 ATTGACTACA AACTGAGTTT CCATCCGCTG ACCAACCGCT ACCGCGTTAC 

351 CGTCGGCGCG TTTTCGACAG ANTACGACAC CTTGGATGCG GCATTGCGCG 

401 CGACCGGCGC GGTTGCCAAC TGGAAAGTCC TGAACAAAGG CGCGCTGTCC 
451 GGTGCGGAAG CAGGGGAAAC CAAGGCGGAA ATCCGCCTGA CGCTGTCCAC 
501 TTCAAAACTG CCCAAGCCTT TTCAAATCAA TGCATTGACT TCTCAAAACT 
551 GGCATTTGGA TTCGGGTTGG AAACCTCTAA ACATCATCGG GAACAAATAA 

This encodes a protein having amino acid sequence [<SEQ ID 370>] (SEQ ID NO: 370): 

1 MAFITRLFKS IKQWLVLLPM LSVLPDAAAE GIDVSRAEAR IXDGGQLSXX 

51 SRFQTELPDQ LQXAXXRGVX LNXTLXWQLS APIIASYRFX LGQLIGDDDX 

101 IDYKLSFHPL TNRYRVTVGA FSTXYDTLDA ALRATGAVAN WKVLNKGALS 

151 GAEAGETKAE IRLTLSTSKL PKPFQINALT SQNWHLDSGW KPLNIIGNK* 

Homology with a predicted ORF from N. gonorrhoeae 

ORF106 (SEQ ID NO: 366) shows 90.5% identity over a 199aa overlap with a predicted ORF 
(ORF106.ng) (SEQ ID NO: 372) from N. gonorrhoeae: 



orf 106 .pep MAFITRLFKSSK-WLIVPLMLPAFQNVAAEGIDVSRAEARITDGGQLSISSRFQTELPDQ 59 

MINIUM I II- I -MM ::MMMMM:MMMMMMM 

orf106ng MAFITRLFKSIKQWLVLLPILSVLPDAAAEGIAATRAEARITDGGRLSISSRFQTELPDQ 60 

orf 106 .pep LQQALRRGVPLNFTLSWQLSAPIIASYRFKLGQLIGDDDNIDYKLSFHPLTKRYRVTVGA 119 

MMMMMMMMMMM M M M M M M M M M M M M M M M M M M I 

orf 106ng LQQALRRGVPLNFTLSWQLSAPTIASYRFKLGQLIGDDDNIDYKLSFHPLTNRYRVTVGA 120 

orf 106 .pep FSTDYDTLDAALRATGAVANWKVLNKGALSGAEAGETKAEIRLTLSTSKLPKPFQINALT 179 

MMMMMMMMMMMMMMMMMMMMMMMMMMMMMM 

orf 106ng FSTDYDTLDAALRATGAVANWKVLNKGALSGAEAGETKAEIRLTLSTSKLPKPFQINALT 180 

orf 106. pep SQNWHLDSGWKPLNIIGNK 198 

MMMMMMMMMI 

orf106ng SQNWHLDSGWKPLNIIGNK 199 

Due to the K->N substitution at residue 111, the homology between ORF106ng (SEQ ID NO: 372) 
and ORF106-1 (SEQ ID NO: 368) is 91.0% over the same 199 aa overlap. 



The complete length ORF106ng nucleotide sequence [<SEQ ID 371>] (SEQ ID NO: 371) is: 



1 ATGGCTTTTA TTACGCGCTT ATTCAAAAGC ATTAAACAAT GGCTTGTGCT 

51 GTTGCCGATA CTCTCCGTTT TGCCGGACGC GGCGGCGGAG GGCATTGCCG 

101 CGACCCGCGC CGAAGCGAGG ATAACCGACG GCGGGCGGCT TTCCATCAGC 

151 AGCCGCTTCC AAACCGAGCT GCCCGACCAG CTCCAACAGG CGTTGCGCCG 

201 GGGCGTACCG CTCAACTTTA CCTTAAGCTG GCAGCTTTCC GCCCCGACAA 

251 TCGCTTCTTA TCGGTTTAAA TTGGGGCAAC TGATTGGCGA TGACGACAAT 

301 ATTGACTACA AACTAAGTTT CCATCCGCTG ACCAACCGCT ACCGCGTTAC 
351 CGTCGGCGCA TTTTCCACCG ATTACGACAC TTTGGATGCG GCATTGCGCG 

401 CGACCGGCGC GGTTGCCAAC TGGAAAGTCC TGAACAAAGG CGCGTTGTCC 
451 GGTGCGGAAG CAGGGGAAAC CAAGGCGGAA ATCCGCCTGA CGCTGTCCAC 
501 TTCAAAACTG CCCAAGCCTT TCCAAATCAA CGCATTGACT TCTCAAAACT 
551 GGCATTTGGA TTCGGGTTGG AAACCTCTAA ACATCATCGG GAACAAATAA 

This encodes a protein having amino acid sequence [<SEQ ID 372>] (SEQ ID NO: 372): 

1 MAFITRLFKS IKQWLVLLPI LSVLPDAAAE GIAATRAEAR ITDGGRLSIS 

51 SRFQTELPDQ LQQALRRGVP LNFTLSWQLS APTIASYRFK LGQLIGDDDN 

101 IDYKLSFHPL TNRYRVTVGA FSTDYDTLDA ALRATGAVAN WKVLNKGALS 

151 GAEAGETKAE IRLTLSTSKL PKPFQINALT SQNWHLDSGW KPLNIIGNK* 

Based on this analysis, including the presence of a putative leader sequence in the gonococcal 
protein, it was predicted that the proteins from N. meningitidis and N. gonorrhoeae, and their 
epitopes, could be useful antigens for vaccines or diagnostics, or for raising antibodies. 

ORF106-1 (SEQ ID NO: 368) (18 kDa) was cloned in pET and pGex vectors and expressed in 
E. coli, as described above. The products of protein expression and purification were analyzed by 
SDS-PAGE. Figure 13A shows the results of affinity purification of the His-fusion protein, and 
Figure 13B shows the results of expression of the GST-fusion in E. coli. Purified His-fusion protein 
was used to immunise mice, whose sera were used for FACS analysis (Figure 13C). These 
experiments confirm that ORF106-1 (SEQ ID NO: 368) is a surface-exposed protein, and that it is 
a useful immunogen. 

Example 44 

The following DNA sequence, believed to be complete, was identified in N. meningitidis [<SEQ ID 
373>] (SEQ ID NO: 373): 

1 ATGGACACAA AAGAAATCCT CGG.TACGCG GcAGGcTCGA TCGGCAGCGC 

51 GGTTTTAGCC GTCATCATCc TGCCGCTGCT GTCGTGGTAT TTCCCCGCCG 

101 ACGACATCGG GCGCATCGTG CTGATGCAGA CGGCGGCGGG GCTgACGGTG 

151 TCGGTGTTGT GCCTCGGGCT GGATCAGGCA TACGTCCGCG AATACTATGC 

201 CACCGCCGAC AAAGACAcCT TGTTCAAAAC CCTGTTCCTG CCGCCGCTGC 

251 TGTCTGCCGC CGCGATAGCC GCCCTGCTGC TTTCCCGCCC GTCCCTGCCG 

301 TCTGAAATCC TGTTTTCACT CGACGATGCC gCCGCCGGCa TCGGGCTGGT 

351 GCTGTTTGAA CtGAGCTTCC TGCCCATCCG cTTTCTCTTA CTGGTTTTGC 

401 GTATGGAAGG ACGCGCCcTT GCCTTTTCGT CCGCGCAACT CGTGCcCAAG 

451 CTCGCCATCC TGCTGCTG.T GCCGCTGACG GTCGGGCTGC TGCACTTTCC 

501 AGCGAACACC GCCGTCCTGA CCGCCGTTTA CGCGCTGGCA AACCTTGCCG 

551 CCGCCGCCTT TTTGCTGTTT CAAAACCGAT GCCGTCTGAA GGCCGTCCGG 

601 CACGCACCGT TTTCGCCCGC CGTCCTGCAC CGGGGG.TGC GCTACGGCAT 

651 ACCGATCGCA CTGAGCAGCA TCGCCTATTG GGGGCTGGCA TCCGCCGACC 

701 GTTTGTTCCT GAAAAAATAT GCCGGCCTGG AACAGCTCGG CGTTTATTCG 

751 ATGGGTATTT CGTTCGGCGG GGCGGCATTA TTGTTCCAAA GCATCTTTTC 

801 AACGGTCTGG ACACCGTATA TTTTCCGCGC AATCGAAGAA AACGCCCCGC 

851 CCGCTCGCCT CTCGGCAACG GCAGAATCCG CCGCCGCCCT GCTTGCCTCC 

901 GCCCTCTGC. TGACCGGCAT TTTCTCGCCC CTTGCCTCCC TCCTGCTGCC 

951 GGAAAACTAC GCCGCCGTCC GGTTTATCGT CGTATCGTGT ATG.TGCCGC 

1001 CGCTGTTTTG CACGCTGGCG GAAATCAGCG GCATCGGTTT GAACGTCGTT 
1051 CGCAAAACGC GCCCGATCGC GCTCGCCACC TTGGGCGCGC TGGCGGCAAA 

1101 CCTGCTGCTG CTGGGGCTTG ACCGTGCCGT ACCGGCGAGG CCGCC.GGCG 

1151 CGGCGGTTGC CTGTGCCGCC TCATTCTGGC TGTTTTTTGC CTTCAAGACC 

1201 GAAAGCTCyT GCCGCCTGTG GCAGCCGCTC AAACGCCTGC CGCTTTATCT 

1251 GCACACATTG TTCTGCCTGA CCTCCTCGGC GGCCTACACC TGCTTCGGCA 

1301 CGCCGGCAAA CTATCCCCTG TTTGCCGGCG TATGGGCGGC ATATCTGGCA 

1351 GGCTGCATCC TGCGCCACCG GAAAGATTTG CACAAACTGT TTCATTATTT 

1401 GAAAAAACAA GGTTTCCCAT TATGA 

This corresponds to the amino acid sequence [<SEQ ID 374; ORF10>] (SEQ ID NO: 374; 
ORF10): 



1 MDTKEILXYA AGSIGSAVLA VIILPLLSWY FPADDIGRIV LMQTAAGLTV 

51 SVLCLGLDQA YVREYYATAD KDTLFKTLFL PPLLSAAAIA ALLLSRPSLP 

101 SEILFSLDDA AAGIGLVLFE LSFLPIRFLL LVLRMEGRAL AFSSAQLVPK 

151 LAILLLXPLT VGLLHFPANT AVLTAVYALA NLAAAAFLLF QNRCRLKAVR 

201 HAPFSPAVLH RGXRYGIPIA LSSIAYWGLA SADRLFLKKY AGLEQLGVYS 

251 MGISFGGAAL LFQSIFSTVW TPYIFRAIEE NAPPARLSAT AESAAALLAS 

301 ALCXTGIFSP LASLLLPENY AAVRFIVVSC MXPPLFCTLA EISGIGLNVV 

351 RKTRPIALAT LGALAANLLL LGLDRAVPAR PXGAAVACAA SFWLFFAFKT 

401 ESSCRLWQPL KRLPLYLHTL FCLTSSAAYT CFGTPANYPL FAGVWAAYLA 
451 GCILRHRKDL HKLFHYLKKQ GFPL* 

Further sequence analysis revealed the complete DNA sequence [<SEQ ID 375>] (SEQ ID NO: 
375) to be: 



1 ATGGACACAA AAGAAATCCT CGGCTACGCG GCAGGCTCGA TCGGCAGCGC 

51 GGTTTTAGCC GTCATCATCC TGCCGCTGCT GTCGTGGTAT TTCCCCGCCG 

101 ACGACATCGG GCGCATCGTG CTGATGCAGA CGGCGGCGGG GCTGACGGTG 

151 TCGGTGTTGT GCCTCGGGCT GGATCAGGCA TACGTCCGCG AATACTATGC 

201 CACCGCCGAC AAAGACACCT TGTTCAAAAC CCTGTTCCTG CCGCCGCTGC 

251 TGTCTGCCGC CGCGATAGCC GCCCTGCTGC TTTCCCGCCC GTCCCTGCCG 

301 TCTGAAATCC TGTTTTCACT CGACGATGCC GCCGCCGGCA TCGGGCTGGT 
351 GCTGTTTGAA CTGAGCTTCC TGCCCATCCG CTTTCTCTTA CTGGTTTTGC 

401 GTATGGAAGG ACGCGCCCTT GCCTTTTCGT CCGCGCAACT CGTGCCCAAG 
451 CTCGCCATCC TGCTGCTGCT GCCGCTGACG GTCGGGCTGC TGCACTTTCC 
501 AGCGAACACC GCCGTCCTGA CCGCCGTTTA CGCGCTGGCA AACCTTGCCG 
551 CCGCCGCCTT TTTGCTGTTT CAAAACCGAT GCCGTCTGAA GGCCGTCCGG 
601 CACGCACCGT TTTCGCCCGC CGTCCTGCAC CGGGGGCTGC GCTACGGCAT 
651 ACCGATCGCA CTGAGCAGCA TCGCCTATTG GGGGCTGGCA TCCGCCGACC 
701 GTTTGTTCCT GAAAAAATAT GCCGGCCTGG AACAGCTCGG CGTTTATTCG 
751 ATGGGTATTT CGTTCGGCGG GGCGGCATTA TTGTTCCAAA GCATCTTTTC 
801 AACGGTCTGG ACACCGTATA TTTTCCGCGC AATCGAAGAA AACGCCCCGC 
851 CCGCCCGCCT CTCGGCAACG GCAGAATCCG CCGCCGCCCT GCTTGCCTCC 
901 GCCCTCTGCC TGACCGGCAT TTTCTCGCCC CTTGCCTCCC TCCTGCTGCC 
951 GGAAAACTAC GCCGCCGTCC GGTTTATCGT CGTATCGTGT ATGCTGCCGC 

1001 CGCTGTTTTG CACGCTGGCG GAAATCAGCG GCATCGGTTT GAACGTCGTC 

1051 CGCAAAACGC GCCCGATCGC GCTCGCCACC TTGGGCGCGC TGGCGGCAAA 

1101 CCTGCTGCTG CTGGGGCTTG CCGTGCCGTC CGGCGGCGCG CGCGGCGCGG 

1151 CGGTTGCCTG TGCCGCCTCA TTCTGGCTGT TTTTTGCCTT CAAGACCGAA 

1201 AGCTCCTGCC GCCTGTGGCA GCCGCTCAAA CGCCTGCCGC TTTATCTGCA 
1251 CACATTGTTC TGCCTGACCT CCTCGGCGGC CTACACCTGC TTCGGCACGC 

1301 CGGCAAACTA TCCCCTGTTT GCCGGCGTAT GGGCGGCATA TCTGGCAGGC 
1351 TGCATCCTGC GCCACCGGAA AGATTTGCAC AAACTGTTTC ATTATTTGAA 

1401 AAAACAAGGT TTCCCATTAT GA 

This corresponds to the amino acid sequence [<SEQ ID 376; ORF10-1>] (SEQ ID NO: 376; 
ORF10-1): 

1 MDTKEILGYA AGSIGSAVLA VIILPLLSWY FPADDIGRIV LMQTAAGLTV 

51 SVLCLGLDQA YVREYYATAD KDTLFKTLFL PPLLSAAAIA ALLLSRPSLP 

101 SEILFSLDDA AAGIGLVLFE LSFLPIRFLL LVLRMEGRAL AFSSAQLVPK 

151 LAILLLLPLT VGLLHFPANT AVLTAVYALA NLAAAAFLLF QNRCRLKAVR 

201 HAPFSPAVLH RGLRYGIPIA LSSIAYWGLA SADRLFLKKY AGLEQLGVYS 

251 MGISFGGAAL LFQSIFSTVW TPYIFRAIEE NAPPARLSAT AESAAALLAS 

301 ALCLTGIFSP LASLLLPENY AAVRFIVVSC MLPPLFCTLA EISGIGLNVV 

351 RKTRPIALAT LGALAANLLL LGLAVPSGGA RGAAVACAAS FWLFFAFKTE 

401 SSCRLWQPLK RLPLYLHTLF CLTSSAAYTC FGTPANYPLF AGVWAAYLAG 

451 CILRHRKDLH KLFHYLKKQG FPL* 

Computer analysis of this amino acid sequence gave the following results: 
Prediction 

ORF10-1 (SEQ ID NO: 376) is predicted to be the precursor of an integral membrane protein, 
since it comprises several (12-13) potential transmembrane segments, and a probable cleavable 
signal peptide. 
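
The text does not say which method produced the transmembrane prediction. As an illustration only, the sketch below scans a sequence with the Kyte-Doolittle hydropathy scale, a different but widely used approach, using a 19-residue window and a mean hydropathy cut-off of 1.6, values commonly quoted for flagging candidate membrane-spanning stretches. The helper names, the window, and the threshold are all assumptions, and the result is not expected to reproduce the 12-13 segments stated above.

```python
# Kyte-Doolittle hydropathy values per residue.
KD = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}

def hydropathy_profile(seq, window=19):
    """Mean hydropathy of every full window along the sequence."""
    # Unknown residues such as 'X' are scored as neutral (0.0).
    scores = [KD.get(aa, 0.0) for aa in seq.upper()]
    if len(scores) < window:
        return []
    return [
        sum(scores[i:i + window]) / window
        for i in range(len(scores) - window + 1)
    ]

def candidate_segments(seq, window=19, threshold=1.6):
    """1-based start positions of windows whose mean hydropathy reaches the threshold."""
    profile = hydropathy_profile(seq, window)
    return [i + 1 for i, value in enumerate(profile) if value >= threshold]

# First 50 residues of ORF10-1 (SEQ ID NO: 376).
n_terminus = "MDTKEILGYAAGSIGSAVLAVIILPLLSWYFPADDIGRIVLMQTAAGLTV"
print(candidate_segments(n_terminus))
```

Consecutive start positions in the output belong to the same hydrophobic stretch, so runs of adjacent values should be merged before counting segments.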

Homology with EpsM (SEQ ID NO: 1137) from Streptococcus thermophilus (accession number 
U40830). 

ORF10 (SEQ ID NO: 374) shows homology with the epsM gene of S. thermophilus, which 
encodes a protein (SEQ ID NO: 1137) of a size similar to ORF10 and is involved in 
exopolysaccharide synthesis. Other homologies are with prokaryotic membrane proteins: 

Identities = (25%) 

Query: 213 LRYGIPLALSSLAYWGLASADRLFLKKYAGLEQLGVYSMGISFGGAALLLQSIFSTVW 270 

L Y +PL SS+ +W L ++ R F+ + G G+ ++ + +IF+ W 

Sbjct: 210 LYYALPLIPSSILWWLLNASSRYFVLFFLGAGANGLLAVATKIPSIISIFNTIFTQAW 267 

Identities = 15/57 (26%), Positives = 31/57 (54%) 

Query: 7 LGYAAGSIGSAVLAVIILPLLSWYFPADDIGRIVLMQTAAGLTVSVLCLGLDQAYVR 63 

L + G++GS +L + ++PL + + + G L QT A L + ++ + + A +R 

Sbjct: 12 LVFTIGNLGSKLLVFLLVPLYTYAMTPQEYGMADLYQTTANLLLPLITMNVFDATLR 68 

Identities = 16/96 (16%), Positives = 36/96 (37%) 

Query: 307 IFSPLASLLLPENYAAVRFTVVSCMLPPLFYTLTEISGIGLNVVRKTRPIXXXXXXXXXX 366 
+ p+ ++ +YA+ V ML LF + ++ G ++T+ + 

Sbjct: 305 VLKPIVEKWSSDYASSWQYVPFFMLSMLFSSFSDFFGTNYIAAKQTKGVFMTSIYGTIV 364 

Homology with a predicted ORF from N. meningitidis (strain A) 



ORF10 (SEQ ID NO: 374) shows 95.4% identity over a 475aa overlap with an ORF (ORF10a) 
(SEQ ID NO: 378) from strain A of N. meningitidis: 



10 20 30 40 50 60 

orf10.pep MDTKEILXYAAGSIGSAVLAVIILPLLSWYFPADDIGRIVLMQTAAGLTVSVLCLGLDQA 

III III I I II II II MM INI MM MM III 1 1 III I III I III II II II II II II I 

orf10a MDTKEILGYAAGSIGSAVLAVIILPLLSWYFPADDIGRIVLMQTAAGLTVSVLCLGLDQA 
10 20 30 40 50 60 

70 80 90 100 110 120 

orf10.pep YVREYYATADKDTLFKTLFLPPLLSAAAIAALLLSRPSLPSEILFSLDDAAAGIGLVLFE 

IIIIIIMIMIIMIIMIII Mil MM IMMIIIIIIIMII MIIMIillll! 

orf10a YVREYYAAADKDTLFKTLFLPPLLSAAAIAALLLSRPSLPSEILFSLDDAAAGIGLVLFE 
70 80 90 100 110 120 



130 140 150 160 170 180 

orf10.pep LSFLPIRFLLLVLRMEGRALAFSSAQLVPKLAILLLXPLTVGLLHFPANTAVLTAVYALA 

I I I I I I I M I I I I I I I I I I I I I I I I I lllllll lllllll Mill III I I III IN 
orf10a LSFLPIRFLLLVLRMEGRALAFSSAQLVSKLAILLLLPLTVGLLHFPANTAVLTAVYALA 

130 140 150 160 170 180 



190 200 210 220 230 240 

orf10.pep NLAAAAFLLFQNRCRLKAVRHAPFSPAVLHRGXRYGIPIALSSIAYWGLASADRLFLKKY 

1 1 1 II 1 1 1 1 1 1 1 1 1 1 1 1 1 1 M I M MINI IIIIIIIIIIIIIIIIMIIIIIIIII 

orf 10a NLAAAAFLLFQNRCRLKAVRRAPFSSAVLHRGLRYGIPIALSSIAYWGLASADRLFLKKY 

190 200 210 220 230 240 



250 260 270 280 290 300 

orf10.pep AGLEQLGVYSMGISFGGAALLFQSIFSTVWTPYIFRAIEENAPPARLSATAESAAALLAS 

MMMMMMMIMM MMMMMMMM MM 1 1 1 II I II 1 1 1 II M I II 1 1 

orf10a AGLEQLGVYSMGISFGGAALLFQSIFSTVWTPYIFRAIEANAPPARLSATAESAAALLAS 

250 260 270 280 290 300 



310 320 330 340 350 360 

orf10.pep ALCXTGIFSPLASLLLPENYAAVRFIVVSCMXPPLFCTLAEISGIGLNVVRKTRPIALAT 

III 1 1 1 1 1 1 1 1 M II 1 1 1 1 1 1 1 1 1 1 1 1 1 1 IIIUIMIIIIIIII I I l!M 

orf10a ALCLTGIFSPLASLLLPENYAAVRFIVVSCMLPPLFCTLVEISGIGLNVVRKTRPIALAT 

310 320 330 340 350 360 



370 380 390 400 410 419 

orf10.pep LGALAANLLLLGLDRAVPAR-PXGAAVACAASFWLFFAFKTESSCRLWQPLKRLPLYLHT 

lllllillllll |||: lllllllllllllhllllllllllllllllllhll 
orf10a LGALAANLLLLGL--AVPSGGARGAAVACAASFWLFFVFKTESSCRLWQPLKRLPLYMHT 

370 380 390 400 410 



420 430 440 450 460 470 

orf10.pep LFCLTSSAAYTCFGTPANYPLFAGVWAAYLAGCILRHRKDLHKLFHYLKKQGFPLX 

1 1 M 1 1 1 1 i 1 1 1 1 M 1 1 1 1 1 M 1 1 M 1 1 1 1 1 1 1 1 1 1 1 1 1 I U. M ' 1 1 M 

orf 10a LFCLASSAAYTCFGTPANYPLFAGVWAVYLAGCILRHRKDLHKLFHYLKKQGFPLX 
420 430 440 450 460 470 



The complete length ORF10a nucleotide sequence [<SEQ ID 377>] (SEQ ID NO: 377) is: 

1 ATGGACACAA ... 
51 GGTTTTAGCC ... 
101 ACGACATCGG ... 
151 TCGGTGTTGT ... 
201 CGCCGCCGAC AAAGACACTT TGTTCAAAAC CCTGTTCCTG CCGCCGCTGC 
251 TGTCTGCCGC CGCGATAGCC GCCCTGCTGC TTTCCCGCCC ATCCCTGCCG 
301 TCTGAAATCC TGTTTTCGCT CGACGATGCC GCCGCCGGCA TCGGGCTGGT 
351 GCTGTTTGAA CTGAGCTTCC TGCCCATCCG CTTTCTCTTA CTGGTTTTGC 
401 GTATGGAAGG ACGCGCCCTT GCCTTTTCGT CCGCGCAACT CGTGTCCAAG 
451 CTCGCCATCC TGCTGCTGCT GCCGCTGACG GTCGGGCTGC TGCACTTTCC 
501 GGCGAACACC GCCGTCCTGA CCGCCGTTTA CGCGCTGGCA AACCTTGCCG 
551 CCGCCGCCTT TTTGCTGTTT CAAAACCGAT GCCGTCTGAA GGCCGTCCGG 
601 CGCGCACCGT TTTCATCCGC CGTCCTGCAT CGCGGCCTGC GCTACGGCAT 
651 ACCGATCGCA CTAAGCAGCA TCGCCTATTG GGGGCTGGCA TCCGCCGACC 
701 GTTTGTTCCT GAAAAAATAT GCCGGCCTAG AACAGCTCGG CGTTTATTCG 
751 ATGGGTATTT CGTTCGGCGG AGCGGCATTA TTGTTCCAAA GCATCTTTTC 
801 AACGGTCTGG ACACCGTATA TTTTCCGCGC AATCGAAGCA AACGCCCCGC 
851 CCGCCCGCCT CTCGGCAACG GCAGAATCCG CCGCCGCCCT GCTTGCCTCC 
901 GCCCTCTGCC TGACCGGCAT TTTCTCGCCC CTCGCCTCCC TCCTGCTGCC 
951 GGAAAACTAC GCCGCCGTCC GGTTTATCGT CGTATCGTGT ATGCTGCCTC 
1001 CGCTGTTTTG CACGCTGGTA GAAATCAGCG GCATCGGTTT GAACGTCGTC 
1051 CGAAAAACAC GCCCGATCGC GCTCGCCACC TTGGGCGCGC TGGCGGCAAA 
1101 CCTGCTGCTG CTGGGGCTTG CCGTACCGTC CGGCGGCGCG CGCGGCGCGG 
1151 CGGTTGCCTG TGCCGCCTCA TTTTGGCTGT TTTTTGTTTT CAAGACCGAA 
1201 AGCTCCTGCC GCCTGTGGCA GCCGCTCAAA CGCCTGCCGC TTTATATGCA 
1251 CACATTGTTC TGCCTGGCCT CCTCGGCGGC CTACACCTGC TTCGGCACTC 
1301 CGGCAAACTA CCCCCTGTTT GCCGGCGTAT GGGCGGTATA TCTGGCAGGC 
1351 TGCATCCTGC GCCACCGGAA AGATTTGCAC AAACTGTTTC ATTATTTGAA 
1401 AAAACAAGGT TTCCCATTAT GA 

This encodes a protein having amino acid sequence [<SEQ ID 378>] (SEQ ID NO: 378): 



1 MDTKEILGYA AGSIGSAVLA VIILPLLSWY FPADDIGRIV LMQTAAGLTV 

51 SVLCLGLDQA YVREYYAAAD KDTLFKTLFL PPLLSAAAIA ALLLSRPSLP 

101 SEILFSLDDA AAGIGLVLFE LSFLPIRFLL LVLRMEGRAL AFSSAQLVSK 

151 LAILLLLPLT VGLLHFPANT AVLTAVYALA NLAAAAFLLF QNRCRLKAVR 

201 RAPFSSAVLH RGLRYGIPIA LSSIAYWGLA SADRLFLKKY AGLEQLGVYS 

251 MGISFGGAAL LFQSIFSTVW TPYIFRAIEA NAPPARLSAT AESAAALLAS 

301 ALCLTGIFSP LASLLLPENY AAVRFIVVSC MLPPLFCTLV EISGIGLNVV 

351 RKTRPIALAT LGALAANLLL LGLAVPSGGA RGAAVACAAS FWLFFVFKTE 

401 SSCRLWQPLK RLPLYMHTLF CLASSAAYTC FGTPANYPLF AGVWAVYLAG 

451 CILRHRKDLH KLFHYLKKQG FPL* 

ORF10a (SEQ ID NO: 378) and ORF10-1 (SEQ ID NO: 376) show 95.4% identity in 475 aa 
overlap: 

10 20 30 40 50 60 

orf 10-1 .pep MDTKEILXYAAGSIGSAVLAVIILPLLSWYFPADDIGRIVLMQTAAGLTVSVLCLGLDQA 

Illllll ililllMIMIIII I IMMMIIIIIIIIIIIII i lllllllllli 

orf 10a MDTKEILGYAAGSIGSAVLAVIILPLLSWYFPADDIGRIVLMQTAAGLTVSVLCLGLDQA 

10 20 30 40 50 60 

70 80 90 100 110 120 

orf10-1.pep YVREYYATADKDTLFKTLFLPPLLSAAAIAALLLSRPSLPSEILFSLDDAAAGIGLVLFE 

I I I I I I : 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 I II I I I I I M M I I M I I 1 1 1 1 1 1 1 1 M 

orf10a YVREYYAAADKDTLFKTLFLPPLLSAAAIAALLLSRPSLPSEILFSLDDAAAGIGLVLFE 

70 80 90 100 110 120 

130 140 150 160 170 180 

orf 10-1 .pep LSFLPIRFLLLVLRMEGRALAFSSAQLVPKLAILLLXPLTVGLLHFPANTAVLTAVYALA 

1 1 1 1 1 1 1 1 1 1 1 1 M II 1 1 1 1 1 1 1 1 1 1 Mill. 1 1 1 1 1 1 1 1 1 1 1 1 1 M 1 1 1 1 1 1 1 1 

orf 10a LSFLPIRFLLLVLRMEGRALAFSSAQLVSKLAILLLLPLTVGLLHFPANTAVLTAVYALA 
130 140 150 160 170 180 

190 200 210 220 230 240 

orf10-1.pep NLAAAAFLLFQNRCRLKAVRHAPFSPAVLHRGXRYGIPIALSSIAYWGLASADRLFLKKY 

1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 M 1 1 1 MINI I IIMIIIIIIIIIIIIII I Ml 

orf10a NLAAAAFLLFQNRCRLKAVRRAPFSSAVLHRGLRYGIPIALSSIAYWGLASADRLFLKKY 

190 200 210 220 230 240 

250 260 270 280 290 300 

orf 10-1 .pep AGLEQLGVYSMGISFGGAALLFQSIFSTVWTPYIFRAIEENAPPARLSATAESAAALLAS 

1 1 1 1 i 1 1 1 1 II 1 1 1 1 1 1 M 1 1 M 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 [ 1 1 1 ! 1 1 1 1 1 1 1 1 1 

orf10a AGLEQLGVYSMGISFGGAALLFQSIFSTVWTPYIFRAIEANAPPARLSATAESAAALLAS 

250 260 270 280 290 300 

310 320 330 340 350 360 

orf10-1.pep ALCXTGIFSPLASLLLPENYAAVRFIVVSCMXPPLFCTLAEISGIGLNVVRKTRPIALAT 

III 1 1 1 M li 1 1 1 1 1 II 1 1 1 1 1 1 1 1 1 1 M M 1 1 1 1 1 M 1 1 1 1 1 1 1 M 1 1 M 1 1 II I 

orf10a ALCLTGIFSPLASLLLPENYAAVRFIVVSCMLPPLFCTLVEISGIGLNVVRKTRPIALAT 

310 320 330 340 350 360 

370 380 390 400 410 419 

orf10-1.pep LGALAANLLLLGLDRAVPAR-PXGAAVACAASFWLFFAFKTESSCRLWQPLKRLPLYLHT 

IMIIIIIIIIII llh 1 1 M 1 1 1 1 1 1 I M M I M M 1 1 1 1 1 1 1 1 M M Ml I 

orf10a LGALAANLLLLGL--AVPSGGARGAAVACAASFWLFFVFKTESSCRLWQPLKRLPLYMHT 

370 380 390 400 410 

420 430 440 450 460 470 

orf 10-1 .pep LFCLTSSAAYTCFGTPANYPLFAGVWAAYLAGCILRHRKDLHKLFHYLKKQGFPLX 

1 1 1 1 : 1 1 1 1 3 1 ! 1 i 1 1 1 1 1 1 1 : 1 1 1 1 1 1 1 1 1 1 !t 1 1 1 1 L I ! M 1 1 1 1 1 ! I 

orf 10a LFCLASSAAYTCFGTPANYPLFAGVWAVYLAGCILRHRKDLHKLFHYLKKQGFPLX 
420 430 440 450 460 470 

Homology with a predicted ORF from N. gonorrhoeae 

ORF10 (SEQ ID NO: 374) shows 94.1% identity over a 475aa overlap with a predicted ORF 
(ORF10.ng) (SEQ ID NO: 380) from N. gonorrhoeae: 

orf 10ng . pep MDTKEILGYAAGSIGSAVLAVIILPLLSWYFPADDIGRIVLMQTAAGLTVSVLCLGLDQA 60 

35 1 1 1 1 1 1 1 I 1 1 1 1 1 1 1 M 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 II I 1 1 1 1 1 1 M 1 1 1 1 1 1 1 1 1 1 II 1 1 

orf10nm MDTKEILXYAAGSIGSAVLAVIILPLLSWYFPADDIGRIVLMQTAAGLTVSVLCLGLDQA 60 

orf10ng.pep YVREYYAAADKDTLFKTLFLPPLLFSAAIAALLLSRPSLPSEILFSLDDAAAGIGLVLFE 120 

I I I I I I : I I I I I I I I I I I I I II Hill! IIIIIIIIIIIIIIIIIMI IMIIII 
orf 10nm YVREYYATADKDTLFKTLFLPPLLSAAAIAALLLSRPSLPSEILFSLDDAAAGIGLVLFE 120 

orf10ng.pep LSFLPIRFLLLVLRMEGRALAFSSAQLVPKLAILLLLPLTVGLLHFPANTSVLTAVYALA 180 

I I I I I I I I I I I I I I I I I I II I I I I I I I I I I I I I I I I I I I I I I I I I I I I I : I I I I I I I I I 
orf 10nm LSFLPIRFLLLVLRMEGRALAFSSAQLVPKLAILLLXPLTVGLLHFPANTAVLTAVYALA 180 

orflOng.pep NLAAAAFLLFQNRCRLKAVRRAPFSPAVLHRGLRYGIPLALSSLAYWGLASADRLFLKKY 24 0 

I I I I I I I I I I I I I I I I I I M II I I I M I I I I I h I I I h I I I I I I I I I I I I I M 
45 orf 10nm NLAAAAFLLFQNRCRLKAVRHAPFS PAVLHRGXRYGI PIALS SI AYWGLASADRLFLKKY 24 0 
orf10ng.pep AGLEQLGVYSMGISFGGAALLLQSIFSTVWTPYIFRAIEENATPARLSATAESAAALLAS 300

MllilMIIIIIMIIIIMII IIIMIIIMM II IMMIIII Mill

orf10nm AGLEQLGVYSMGISFGGAALLFQSIFSTVWTPYIFRAIEENAPPARLSATAESAAALLAS 300

orf10ng.pep ALCLTGIFSPLASLLLPENYAAVRFTWSCMLPPLFYTLTEISGIGLNWRKTRPIALAT 360

III IMIIMM MIMIIIM Mill 1 1 1 1 I Ml 1 1 1 1 1 1 1 II I II II 1 1 1 1 1

orf10nm ALCXTGIFSPLASLLLPENYAAVRFIWSCMXPPLFCTLAEISGIGLNWRKTRPIALAT 360

370 380 390 400 410

orf10ng.pep LGALAANLLLLGL--AVPSGGTRGAAVACAASFWLFFVFKTESSCRLWQPLKRLPLYMHT

MIMIIMM llh lllllllllllllhllllllMllllllllllhll

orf10nm LGALAANLLLLGLDRAVPAR-PXGAAVACAASFWLFFAFKTESSCRLWQPLKRLPLYLHT

370 380 390 400 410 

420 430 440 450 460 470 

orf 10ng . pep LFCLASSAAYTCFGTPANYPLFAGVWAAYLAGCILRHRKNLHKLFHYLKKQGFPLX 

• lllhl MIMIIIIII I MIMIIIIII MM 1 1 1 1 hi 1 1 II lllllllll II 

orf 10nm LFCLTSSAAYTCFGTPANYPLFAGVWAAYLAGCILRHRKDLHKLFHYLKKQGFPLX 
420 430 440 450 460 470 

The complete length ORF10ng nucleotide sequence [<SEQ ID 379>] (SEQ ID NO: 379) is:



1 ATGGACACAA AAGAAATCCT CGGCTACGCG GCAGGCTCGA TCGGCAGCGC 

51 GGTTTTAGCC GTCATCATCC TGCCGCTGCT GTCGTGGTAT TTCcccgCCG 

101 ACGACATCGG GCGCATCGTG CTGATGCAGA CGGCGGCGGG ACTGACGGTG 

151 TCGGTATTGT GCCTCGGGCT GGATCAGGCA TACGTCCGCG AATACTATGC 

2 01 CGCCGCCGAC AAAGACACTT TGTTCAAAAC CCTGTTCCTG CCGCCGCTGC 

2 51 TGTTTTCCGC CGCGATAGCC GCCCTGCTGC TTTCCCGCCC GTCCCTGCCG 
301 TCTGAAATCC TGTTTTCGCT CGACGATGCC GCCGCCGGCA TCGGGCTGGT 

3 51 GCTGTTTGAA CTGAGCTTCC TGCCCATCCG CTTTCTCTTA CTGGTTTTGC 

4 01 GTATGGAAGG GCGCGCCCTT GCCTTTTCGT CCGCGCAACT CGTGCCCAAA 
451 CTCGCCATTC TGCTGCTGTT GCCGCTGACG GTCGGGCTGC TGCACTTTCC 
501 GGCGAACACC TCCGTCCTGA CCGCCGTTTA CGCGCTGGCA AACCTTGCCG 
551 CCGCCGCCTT TTTGCTGTTT CAAAACCGAT GCCGTCTGAA GGCCGTCCGG. 
601 CGCGCGCCGT TTTCGCCCGC CGTCCTGCAC CGGGGGCTGC GCTACGGCAT 
651 ACCGCTCGCA CTGAGCAGCC TTGCCTATTG GGGGCTGGCA TCCGCCGACC 
701 GTTTGTTCCT GAAAAAATAT GCGGGCCTGG AACAGCTCGG CGTTTATTCG 
751 ATGGGTATTT CGTTCGGCGG GGCGGCATTA TTGCTCCAAA GCATCTTTTC 
801 AACGGTCTGG ACACCGTATA TTTTCCGTGC AATCGAAGAA AACGCCACGC 
851 CCGCCCGCCT CTCGGCAACG GCAGAATCCG CCGCCGCCCT GCTTGCCTCC 
901 GCCCTCTGCC TGACCGGAAT TTTCTCGCCC CTCGCCTCCC TCCTGCTGCC 
951 GGAAAACTAC GCCGCCGTCC GGTTTACGGT CGTATCGTGT ATGCTGccgc 

1001 cgctGTTTTA CACGCTGACC GAAATCAGCG GCATCGGTTT GAACGTCGTC 

1051 CGCAAAACGC GTCCGATCGC GCTTGCCACC TTGGGCGCGC TGGCGGCAAA 

1101 CCTGCTGCTG CTGGGGCTTG CCGTACCGTC CGGCGGCACG CGCGGCGCGG 

1151 CGGTTGCCTG TGCCGCCTCA TTCTGGTTGT TTTTTGTTTT CAAGACAGAA 

1201 AGCTCCTGCC GCCTGTGGCA GCCGCTCAAA CGCCTGCCGC TTTATATGCA 

1251 CACATTGTTC TGCCTgGCCT CCTCGGCGGC CTACACCTGC TTCGGCACAC 

13 01 CGGCAAACTA CCCcctgttt gccggcgtAT GGGCGGCATA TCTGGCAGGC 

1351 TGCATCCTGC GCCACCGGAA AAATTTGCAC AAACTGTTTC ATTATTTGAA 

1401 AAAACAAGGT TTCCCATTAT GA 

This encodes a protein having amino acid sequence [<SEQ ID 380>] (SEQ ID NO: 380):



1 MDTKEILGYA AGSIGSAVLA VIILPLLSWY FPADDIGRIV LMQTAAGLTV 

51 SVLCLGLDQA YVREYYAAAD KDTLFKTL FL PPLLFSAAIA ALLL SRPSLP 

101 SEILFSLDDA AAGIGLVLFE LSFLPIRFLL LVLRMEGRAL AFSSAQLVPK 

151 LAILLLLPLT VGLLHFPANT SVLTAVYALA NLAAAAFLLF QNRCRLKAVR 



201 RAPFSPAVLH RGLRYGIPLA LSSLAYWGLA SADRLFLKKY AGLEQLGVYS

251 MGISFGGAAL LLQSIFSTVW TPYIFRAIEE NATPARLSAT AESAAALLAS 

301 ALCLTGIFSP LASLLLPENY AAVRFTWSC MLPPLFYTLT EISGIGLNW

351 RKTRPIALAT LGALAANLLL LGLAVPSGGT RGAAVACAAS FWLFFVFKTE

401 SSCRLWQPLK RLPLYMHTLF CLASSAAYTC FGTPANYPLF AGVWAAYLAG

451 CILRHRKNLH KLFHYLKKQG FPL*

ORF10ng (SEQ ID NO: 380) and ORF10-1 (SEQ ID NO: 376) show 96.4% identity in 473 aa
overlap:

10 10 20 30 40 50 60 

orf 10 - 1 . pep MDTKEILGYAAGSIGSAVLAVI ILPLLSWYFPADDIGRIVLMQTAAGLTVSVLCLGLDQA 

1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 II 1 1 H M 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 i I 

orf 10ng- 1 MDTKEILGYAAGSIGSAVLAVIILPLLSWYFPADDIGRIVLMQTAAGLTVSVLCLGLDQA 

10 20 30 40 50 60 

15 70 80 90 100 110 120 

orf 10-1 .pep YVREYYATADKDTLFKTLFLPPLLSAAAIAALLLSRPSLPSEILFSLDDAAAGIGLVLFE 

MINI Mil MINIMI MM : 1 1 1 1 1 1 1 II 1 1 1 1 1 1 1 1 II 1 1 1 1 1 1 1 1 1 1 II 1 1 1 

orf 10ng-l YVREYYAAADKDTLFKTLFLPPLLFSAAIAALLLSRPSLPSEILFSLDDAAAGIGLVLFE 

70 80 90 100 110 120 

20 130 140 150 160 170 180 

orf 10-1 .pep LS FLP IRFLLLVLRMEGRALAFSSAQLVPKLAI LLLLPLTVGLLHFPANTAVLTAVYALA 

1 1 1 1 1 Ml 1 1 1 1 1 1 1 1 1 II 1 1 1 M M 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 h I ■ 1 1 1 1 M I 

orf 10ng-l LSFLPIRFLLLVLRMEGRALAFSSAQLVPKLAILLLLPLTVGLLHFPANTSVLTAVYALA 

130 140 150 160 170 180 

25 190 200 210 220 230 240 

or f 10 - 1 . pep NLAAAAFLL FQNRCRLKAVRHAP FS P AVLHRGLR YG I P I ALS S I AYWGLAS ADRL FLKKY 
llllll lllllllillMMIIMI IIIIIIIIIMIIMIIIIIII lllllll 
orfl0ng-l NLAAAAFLL FQNRCRLKAVRRAP FS P AVLHRGLR YG I PLALS SLAYWGLAS ADRLFLKKY 

190 200 210 220 230 240 

30 250 260 270 280 290 300 

orf 10-1. pep AGLEQLGVYSMGI SFGGAALLFQS I FSTVWTPY I FRAIEENAPPARLS ATAES AAALLAS 

IMNIIIIIIIIIIIMIIMII I MIIMIIIIIIM 1 1 M 1 1 1 1 1 1 M 1 1 1 

orf 10ng- 1 AGLEQLGVYSMGISFGGAALLLQSIFSTVWTPYIFRAIEENATPARLSATAESAAALLAS 
250 260 270 280 290 300 

35 310 320 330 340 350 360 

orf 10-1 .pep ALCLTG I FS PLASLLLPENYAAVRF I WS CMLPPLFCTLAE I SG I GLNWRKTRP I ALAT 

II 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 MINIUM IMIIIIIIIM I lllllll 

orf 10ng-l ALCLTGIFSPLASLLLPENYAAVRFTWSCMLPPLFYTLTEISGIGLNWRKTRPIALAT 
310 320 330 340 350 360 

40 370 380 390 400 410 420 

orf 10-1 .pep LGALAANLLLLGLAVPSGGARGAAVACAAS FWLFFAFKTESSCRLWQPLKRLPLYLHTLF 
I I I , I I I I I I I I I I I I I I : I I I I I . I I I Ml I I h I I M I I I I I I I I I I I I I hi I I I 
orf 10ng-l LGALAANLLLLGLAVPSGGTRGAAVACAASFWLFFVFKTESSCRLWQPLKRLPLYMHTLF 

370 380 390 400 410 420 

45 430 440 450 460 470 

orf 10 - 1 . pep CLTSSAAYTCFGTPANYPLFAGVWAAYLAGCILRHRKDLHKLFHYLKKQGFPLX 

1 1 ^ 1 1 1 1 i 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 = 1 1 1 1 1 1 1 1 1 1 1 1 i 1 1 1 

orf 10ng-l CLASSAAYTCFGTPANYPLFAGVWAAYLAGCILRHRKNLHKLFHYLKKQGFPLX 
430 440 450 460 470 
Based on this analysis, including the presence of a putative leader peptide and several
transmembrane segments and the presence of a leucine-zipper motif (4 Leu residues spaced by 6
aa, shown in bold), it is predicted that these proteins from N. meningitidis and N. gonorrhoeae, and
their epitopes, could be useful antigens for vaccines or diagnostics, or for raising antibodies.
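
The leucine-zipper criterion mentioned above (4 Leu residues spaced by 6 aa) can be checked mechanically. The following is a minimal sketch, not the analysis actually used for this specification: it assumes that "spaced by 6 aa" means a Leu at every seventh position, and the function name leucine_heptad_runs is hypothetical.

```python
def leucine_heptad_runs(protein, repeats=4, spacing=7):
    """Return 0-based start indices where `repeats` Leu residues
    occur at intervals of `spacing` positions.

    Assumption: "4 Leu residues spaced by 6 aa" is read as Leu at
    positions i, i+7, i+14 and i+21 (six residues in between).
    """
    seq = protein.upper()
    span = spacing * (repeats - 1)  # distance from first to last Leu
    hits = []
    for start in range(len(seq) - span):
        if all(seq[start + k * spacing] == "L" for k in range(repeats)):
            hits.append(start)
    return hits


# Synthetic illustration only (not a sequence from this document):
toy = "LAAAAAA" * 3 + "L"
print(leucine_heptad_runs(toy))
```

A hit from such a scan is only a pattern match; it does not by itself establish that the region forms a zipper.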

Example 45 

The following partial DNA sequence was identified in N. meningitidis [<SEQ ID 381>] (SEQ ID
NO: 381):

1 . . ATCCTGAAAC CGCATAACCA GCTTAAGGAA GACATCCAAC CTGATCCGGC 

51 CGATCAAAAC GCCTTGTCCG AACCGGATGC TGCGACAGAG GCAGAGCAGT 

101 CGGATGCGGA AAATGCTGCC GACAAGCAGC CCGTTGCCGA TAAAGCCGAC 

151 GAGGTTGAAG AAAAGGCGGG CGAGCCGGAA CGGGAAGAGC CGGACGGACA 

2 01 GGCAGTGCGT AAGAAAGCGC TGACGGAAGA GCGTGAACAA ACCGTCAGGG 
251 AAAAAGCGCA GAAGAAAGAT GCCGAAACGG TTAAAATACA AGCGGTAAAA 

3 01 CCGTCTAAAG AAACAGAGAA AAAAGCTTCA AAAGAAGAGA AAAAGGCGGC 
351 GAAGGAAAAA GTTGCACCCA AACCAACCCC GGAACAAATC CTCAACAGCG 

4 01 GCAgCATCGA AAAmGCGCGC AgTGCCGCCG CCAAAGAAGT GCAGAAAATG 
4 51 AA . AACGTCC GACAAGGCGG AAGC.AACGC ATTATCTGCA AATGGGCGCG 
501 TATGCCGACC GTCAGAGCGC GGAAGGGCAG CGTGCCAAAC TGGCAATCTT 
551 GGGCATATCT TCCAAGGTGG TCGGTTATCA GGCGGGACAT AAAACGCTTT 
601 ACCGGGTGCA AAGCGGCAAT ATGTCTGCCG ATGCGGTGA 

This corresponds to the amino acid sequence [<SEQ ID 382; ORF65>] (SEQ ID NO: 382;
ORF65):



1 . . ILKPHNQLKE DIQPDPADQN ALSEPDAATE AEQSDAENAA DKQPVADKAD 

51 EVEEKAGEPE REEPDGQAVR KKALTEEREQ TVREKAQKKD AETVKIQAVK 

101 PSKETEKKAS KEEKKAAKEK VAPKPTPEQI LNSGSIEXAR SAAAKEVQKM 

151 XNVRQGGSXR IICKWARMPT VRARKGSVPN WQSWAYLPRW SVIRRDIKRF 

201 TGCKAAICLP MR* 

Further work revealed the complete nucleotide sequence [<SEQ ID 383>] (SEQ ID NO: 383):

1 ATGTTTATGA ACAAATTTTC CCAATCCGGA AAAGGTCTGT CCGGTTTTTT 

51 CTTCGGTTTG ATACTGGCGA CGGTCATTAT TGCCGGTATT TTGTTTTATC 

101 TGAACCAGAG CGGTCAAAAT GCGTTCAAAA TCCCGGCTTC GTCGAAGCAG 

151 CCTGCAGAAA CGGAAATCCT GAAACCGAAA AACCAGCCTA AGGAAGACAT 

201 CCAACCTGAA CCGGCCGATC AAAACGCCTT GTCCGAACCG GATGCTGCGA 

251 CAGAGGCAGA GCAGTCGGAT GCGGAAAAAG CTGCCGACAA GCAGCCCGTT 

301 GCCGATAAAG CCGACGAGGT TGAAGAAAAG GCGGGCGAGC CGGAACGGGA 

351 AGAGCCGGAC GGACAGGCAG TGCGTAAGAA AGCGCTGACG GAAGAGCGTG 

4 01 AACAAACCGT CAGGGAAAAA GCGCAGAAGA AAGATGCCGA AACGGTTAAA 

451 AAACAAGCGG TAAAACCGTC TAAAGAAACA GAGAAAAAAG CTTCAAAAGA 

501 AGAGAAAAAG GCGGCGAAGG AAAAAGTTGC ACCCAAACCA ACCCCGGAAC 

551 AAATCCTCAA CAGCGGCAGC ATCGAAAAAG CGCGCAGTGC CGCCGCCAAA 

601 GAAGTGCAGA AAATGAAAAC GTCCGACAAG GCGGAAGCAA CGCATTATCT 

651 GCAAATGGGC GCGTATGCCG ACCGTCAGAG CGCGGAAGGG CAGCGTGCCA 

701 AACTGGCAAT CTTGGGCATA TCTTCCAAGG TGGTCGGTTA TCAGGCGGGA 
751 CATAAAACGC TTTACCGGGT GCAAAGCGGC AATATGTCTG CCGATGCGGT
801 GAAAAAAATG CAGGACGAGT TGAAAAAACA TGAAGTCGCC AGCCTGATCC 
851 GTTCTATCGA AAGCAAATAA 

This corresponds to the amino acid sequence [<SEQ ID 384; ORF65-1>] (SEQ ID NO: 384;
ORF65-1):

1 MFMNKFSQSG KGLSG FFFGL ILATVIIAGI LF YLNQSGQN AFKIPASSKQ 

51 PAETEILKPK NQPKEDIQPE PADQNALSEP DAATEAEQSD AEKAADKQPV 

101 ADKADEVEEK AGEPEREEPD GQAVRKKALT EEREQTVREK AQKKDAETVK 

10 151 KQAVKPSKET EKKASKEEKK AAKEKVAPKP TPEQILNSGS IEKARSAAAK 

201 EVQKMKTSDK AEATHYLQMG AYADRQSAEG QRAKLAILGI SSKWGYQAG 

251 HKTLYRVQSG NMSADAVKKM QDELKKHEVA SLIRSIESK* 
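
The correspondence between a nucleotide sequence and the amino acid sequence it encodes can be illustrated with a short translation routine. This is a sketch under stated assumptions: it uses the standard genetic code, reads in frame from the first base, and writes X for any codon containing an ambiguous or unread base (mirroring the X residues in the listings). The names CODON_TABLE and translate are illustrative and are not taken from the specification.

```python
from itertools import product

# Standard genetic code, codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): AMINO[i]
    for i, codon in enumerate(product(BASES, repeat=3))
}


def translate(dna):
    """Translate a DNA string in frame 1; '*' marks a stop codon.

    Whitespace is ignored and case is folded. Codons that are not
    plain A/C/G/T triplets are reported as 'X'.
    """
    seq = "".join(dna.split()).upper()
    residues = []
    for i in range(0, len(seq) - 2, 3):
        residues.append(CODON_TABLE.get(seq[i:i + 3], "X"))
    return "".join(residues)


# First 30 bases of the complete ORF65-1 nucleotide sequence above:
print(translate("ATGTTTATGA ACAAATTTTC CCAATCCGGA"))
```

Trailing bases that do not fill a whole codon are dropped, which matches how the partial sequences in this document end.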

Computer analysis of this amino acid sequence gave the following results: 

Homology with a predicted ORF from N. meningitidis (strain A)

ORF65 (SEQ ID NO: 382) shows 92.0% identity over a 150aa overlap with an ORF (ORF65a)
(SEQ ID NO: 386) from strain A of N. meningitidis:

10 20 30 

orf65 pep ILKPHNQLKEDIQPDPADQNALSEPDAATE 

20 Nihil III 1 1 hi M IIIMII II I 

orf 65a IIAGILF YLNQSGQNAFKIPVPSKQPAETEILKPKNQPKEDIQPEPADQNALSSPDAAKE 
30 40 50 60 70 80 

40 50 60 70 80 90 

orf 65 .pep AEQSDAENAADKQPVADKADEVEEKAGEPEREEPDGQAVRKKALTEEREQTVREKAQKKD 

25 llllllhllllllllllllllllM llllh IMIIIIIIIIIIIIIII IIIMII 

orf 65a AEQSDAEKAADKQPVADKADEVEEKADEPEREKSDGQAVRKKALTEEREQTVGEKAQKKD 
90 100 110 120 130 140 

100 110 120 130 140 150 

orf 65 . pep AETVKI QAVKPS KETEKKASKEEKKAAKEKVAPKPTPEQ I LNSGS I EXARS AAAKEVQKM 

30 Mill I INI II MM II 'Ml II I II II II II II II II II II I II II II II II I 

orf 65a AETVKKQAVKPSKETEKKASKEEKKAEKEKVAPKPTPEQILNSGSIEKARSAAAKEVQKM 
150 160 170 180 190 200 - 



35 



160 170 180 190 200 210 

orf 65 .pep XNVRQGGSXRIICXWARMPTVRARKGSVPNWQSWAYLPRWSVIRRDIKRFTGCKAAICLP 

or f 6 5a KTPDKAEATHYLQMGAYADRRSAEGQRAKLAILGISSKWGYQAGHKTLYRVQSGNMSAD 
210 220 230 240 250 260 

The complete length ORF65a nucleotide sequence [<SEQ ID 385>] (SEQ ID NO: 385) is:

40 1 ATGTTTATGA ACAAATTTTC CCAATCCGGA AAAGGTCTGT CCGGTTTTTT 

51 CTTCGGTTTG ATACTGGCGA CGGTCATTAT TGCCGGTATT TTGTTTTATC 

101 TGAACCAGAG CGGTCAAAAT GCGTTCAAAA TCCCGGTTCC GTCGAAGCAG 

151 CCTGCAGAAA CGGAAATCCT GAAACCGAAA AACCAGCCTA AGGAAGACAT 

201 CCAACCTGAA CCGGCCGATC AAAACGCCTT GTCCGAACCG GATGCTGCGA 
251 AAGAGGCAGA GCAGTCGGAT GCGGAAAAAG CTGCCGACAA GCAGCCCGTT

3 01 GCCGACAAAG CCGACGAGGT TGAGGAAAAG GCGGACGAGC CGGAGCGGGA 
351 AAAGTCGGAC GGACAGGCAG TGCGCAAGAA AGCACTGACG GAAGAGCGTG 
401 AACAAACCGT CGGGGAAAAA GCGCAGAAGA AAGATGCCGA AACGGTTAAA 

5 4 51 AAACAAGCGG TAAAACCATC TAAAGAAACA GAGAAAAAAG CTTCAAAAGA 

501 AGAGAAAAAG GCGGAGAAGG AAAAAGTTGC ACCCAAACCG ACCCCGGAAC 

551 AAATCCTCAA CAGCGGCAGC ATCGAAAAAG CGCGCAGTGC CGCTGCCAAA 

601 GAAGTGCAGA AAATGAAAAC GCCCGACAAG GCGGAAGCAA CGCATTATCT 

651 GCAAATGGGC GCGTATGCCG ACCGCCGGAG CGCGGAAGGG CAGCGTGCCA 

10 701 AACTGGCAAT CTTGGGCATA TCTTCCAAGG TGGTCGGTTA TCAGGCGGGA 

751 CATAAAACGC TTTACCGGGT GCAAAGCGGC AATATGTCTG CCGATGCGGT 

801 GAAAAAAATG CAGGACGAGT TGAAAAAACA TGAAGTCGCC AGCCTGATCC 

851 GTTCTATCGA AAGCAAATAA - 

This encodes a protein having amino acid sequence [<SEQ ID 386>] (SEQ ID NO: 386):

1 MFMNKFSQSG KGLSG FFFGL ILATVIIAGI LF YLNQSGQN AFKIPVPSKQ 

51 PAETEILKPK NQPKEDIQPE PADQNALSEP DAAKEAEQSD AEKAADKQPV 

101 ADKADEVEEK ADEPEREKSD GQAVRKKALT EEREQTVGEK AQKKDAETVK 

151 KQAVKPSKET EKKASKEEKK AEKEKVAPKP TPEQILNSGS IEKARSAAAK 

20 201 EVQKMKTPDK AEATHYLQMG AYADRRSAEG QRAKLAILGI SSKWGYQAG 

251 HKTLYRVQSG NMSADAVKKM QDELKKHEVA SLIRSIESK* 

ORF65a (SEQ ID NO: 386) and ORF65-1 (SEQ ID NO: 384) show 96.5% identity in 289 aa
overlap:



25 



10 20 30 40 50 60 

orf 65a. pep MFMNKFSQSGKGLSGFFFGL I LATVI I AGILFYLNQSGQNAFKIPVPSKQ PAETEILKPK 
i I I I I I I I M I I I ! I I I I I I I I I ! I I I I I I I I I ! I I I I I I I I: t I I I I I 1 Mill 
orf 65-1 MFMNKFSQSGKGLSGFFFGL I LATVI IAGILFYLNQSGQNAFKI PAS SKQ PAETEILKPK 

10 20 30 40 50 60 
70 80 90 100 110 120

orf 65a . pep NQPKEDIQPEPADQNALSEPDAAKEAEQSDAEKAADKQPVADKADEVEEKADEPEREKSD 

II MIMIMIMIIM Mill 1 1 II II II II II III MM I II 1 1 1 1 11111= I 

orf 65-1 NQPKEDIQPEPADQNALSEPDAATEAEQSDAEKAADKQPVADKADEVEEKAGEPEREEPD 

70 80 90 100 . 110 120 
130 140 150 160 170 180

orf 65a . pep GQAVRKKALTEEREQTVGEKAQKKDAETVKKQAVKPSKETEKKASKEEKKAEKEKVAPKP 

III II II III Mil Ml IMMII III 1 1 II II 1 1 Ml MM Mill! I Mill III 

or f 6 5 - 1 GQAVRKKALTEEREQTVREKAQKKDAETVKKQAVKPSKETEKKASKEEKKAAKEKVAPKP 

130 140 150 160 170 180 
190 200 210 220 230 240

orf 65a . pep TPEQI LNSGS I EKARS AAAKEVQKMKTPDKAEATHYLQMGAYADRRSAEGQRAKLAI LGI 

! II IMIII INI 1 1 1 1 II MM III 1 1 II 1 1 1 1 1 M I II 1 1 MM 1 1 1 1 1 1 1 1 1 1 1 

orf 65-1 TPEQI LNSGS IEKARSAAAKEVQKMKTSDKAEATHYLQMGAYADRQSAEGQRAKLAILG I 

190 200 210 220 230 240 



45 



250 260 270 280 290 

orf 65a. pep S S KWGYQAGHKTLYRVQSGNMS ADAVKKMQDELKKHE VASL I RS I ES KX 
II II I II I II I I I II I I I I M I I M I I I M I M I M I I M II I I I I II I 
or f 6 5 - 1 SS KWG YQAGHKTL YRVQS GNMS ADAVKKMQDELKKHE VAS L I RS I ES KX 

250 260 270 280 290 
Homology with a predicted ORF from N. gonorrhoeae

ORF65 (SEQ ID NO: 382) shows 89.6% identity over a 212aa overlap with a predicted ORF
(ORF65.ng) (SEQ ID NO: 388) from N. gonorrhoeae:

30 40 50 60 70 80 

5 ORF65ng IIAGILLYLNQGGQNAFKIPAPSKQPAETEILKLKNQPKEDIQPEPADQNALSEPDVAKE 

III :|| I I I I I I : M M I I I I I > : I I 
ORF65 ILKPHNQLKEDIQPDPADQNALSEPDAATE 

10 20 30 

90 100 110 120 130 140 

1 0 ORF6 5ng AEQSDAEKAADKQPVADKADEVEEKAGEPEREEPDGQAVRKKALTEEREQTVREKAQKKD 

■ I ! 1 1 1 - i 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 M 1 1 M I I I I I I I I I II I I i I I I I ' I M 

ORF65 AEQSDAENAADKQPVADKADEVEEKAGEPEREEPDGQAVRKKALTEEREQTVREKAQKKD 
40 50 60 70 80 90 

150 160 170 180 190 200 

1 5 ORF65ng AETVKKKAVKPSKETEKKASKEEKKAAKEKVAPKPTPEQILNSRSIEKARSAAAKEVQKM 

Mill H III IMIIII lllllll IMMIIIIMMIMM III llllllllllll 

ORF65 AETVKIQAVKPSKETEKKASKEEKKAAKEKVAPKPTPEQILNSGSIEXARSAAAKEVQKM 
100 110 120 130 140 150 

210 220 230 240 250 260 

20 ORF65ng KNFGQGGSQRIICKWARMPNPGARKGSVPNWQSWAYLPKWSAIRRDIKRFTACKAAICPP 

I MM MINIMI: 1 1 1 1 1 1 II II M 1 1 1 h M • II 1 1 1 1 1 1 h I II 1 1! I 

ORF65 XNVRQGGSXRI I CKWARMPTVRARKGS VPNWQS WAYLPRWS V I RRD I KRFTGCKAA I CLP 

160 170 180 190 200 210 
ORF65ng MR
II 

ORF65 MR 

An ORF65ng nucleotide sequence [<SEQ ID 387>] (SEQ ID NO: 387) was predicted to encode a
protein having amino acid sequence [<SEQ ID 388>] (SEQ ID NO: 388):

1 MFMNKFSQSG KGLSGFFFGL ILATVIIAGI LLYLNQGGQN AFKIPAPSKQ

51 PAETEILKLK NQPKEDIQPE PADQNALSEP DVAKEAEQSD AEKAADKQPV

101 ADKADEVEEK AGEPEREEPD GQAVRKKALT EEREQTVREK AQKKDAETVK

151 KKAVKPSKET EKKASKEEKK AAKEKVAPKP TPEQILNSRS IEKARSAAAK

201 EVQKMKNFGQ GGSQRIICKW ARMPNPGARK GSVPNWQSWA YLPKWSAIRR

251 DIKRFTACKA AICPPMR*

After further analysis, the complete gonococcal DNA sequence [<SEQ ID 389>] (SEQ ID NO:
389) was found to be:

1 ATGTTTATGA ACAAATTTTC CCAATCCGGA AAAGGTCTGT CCGGTTTCTT 

40 51 CTTCGGTTTG ATACTGGCAA CGGTCATTAT TGCCGGTATT TTGCTTTATC 

101 TGAACCAGGG CGGTCAAAAT GCGTTCAAAA TCCCGGCTCC GTCGAAGCAG 

151 CCTGCAGAAA CGGAAATCCT GAAACTGAAA AACCAGCCTA AGGAAGACAT 

2 01 CCAACCTGAA CCGGCCGATC AAAACGCCTT GTCCGAACCG GATGTTGCGA 

251 AAGAGGCAGA GCAGTCGGAT GCGGAAAAAG CTGCCGACAA GCAGCCCGTT 
301 GCCGACAAag ccgacgAGGT TGAAGAAAag GcGGgcgAgc cggaACGGga

351 aGAGCCGGAC ggACAGGCAG TGCGCAAGAA AGCACTGAcg gAAGAgcGTG

401 AACAAACcgt cagggAAAAA GCGCagaaga AAGATGCCGA AACGgTTAAA

451 AAacaaGCgg tAaaaccgtc tAAAGAAACa gagaaaaaag cTtcaaaaga

501 agagaaaaag gcggcgaaag aaaAAGttgc acccaaaccg accccggaaC

551 aaatcctcaa cagccgCagc atcgaaaaag cgcgtagtgc cgctgccaaa

601 gaAgtgcaGA AAatgaaaaa ctTtgggcaa ggcgGaagcc aacgcattaT

651 CTGcaaatgg gcgcgtatgc cgaccgtccg gagcgcggaA gggcagcgtg

701 ccaaACtggc aAtcttgGgc atatctTccg aagtggtcgG CTATCAGGCG

751 GGACATAAAA CGCTTTACCG CGTGCAAagc GGCAatatgt ccgccgatgc

801 gGTGAAAAAA ATGCAGGACG AGTTGAAAAA GCATGGGGtt gcCAGCCTGA

851 TCCGTGcgAT TGAAGGCAAA TAA



This encodes the following amino acid sequence [<SEQ ID 390>] (SEQ ID NO: 390):

1 MFMNKFSQSG KGLSGFFFGL ILATVIIAGI LLYLNQGGQN AFKIPAPSKQ

51 PAETEILKLK NQPKEDIQPE PADQNALSEP DVAKEAEQSD AEKAADKQPV

101 ADKADEVEEK AGEPEREEPD GQAVRKKALT EEREQTVREK AQKKDAETVK

151 KQAVKPSKET EKKASKEEKK AAKEKVAPKP TPEQILNSRS IEKARSAAAK

201 EVQKMKNFGQ GGSQRIICKW ARMPTVRSAE GQRAKLAILG ISSEVVGYQA

251 GHKTLYRVQS GNMSADAVKK MQDELKKHGV ASLIRAIEGK



ORF65ng-1 (SEQ ID NO: 390) and ORF65-1 (SEQ ID NO: 384) show 89.0% identity in 290 aa
overlap:

10 20 30 40 50 60

orf65-l .pep MFMNKFSQSGKGLSGFFFGLILATVIIAGILFYLNQSGQNAFKIPASSKQPAETEILKPK 

IIIIIIIIIIIIIIIIIMIIIMIMMIhlllhlllllllll Ml I hi I 

orf65ng-l MFMNKFSQSGKGLSGFFFGLILATVIIAGILLYLNQGGQNAFKIPAPSKQ PAETEILKLK 

10 20 30 40 50 60 



30 



70 80 90 100 110 120 

orf 65- 1 . pep NQPKEDIQPEPADQNALSEPDAATEAEQSDAEKAADKQPVADKADEVEEKAGEPEREEPD 

II 1 1 1 II 1 1 II 1 1 1 1 1 1 1 1 1 M IIIIIIIIII.I.MII I hi Mill III 

orf65ng-l NQPKEDIQPEPADQNALSEPDVAKEAEQSDAEKAADKQPVADKADEVEEKAGEPEREEPD 

70 80 90 100 110 120 



35 



130 140 150 160 170 180 

orf65-l .pep GQAVRKKALTEEREQTVREKAQKKDAETVKKQAVKPSKETEKKASKEEKKAAKEKVAPKP 

IIIIIIIIIIMIIIIIIIIIMIIIIIIIIIIIIIIIIIIIIIIMIIIIIIIIIIIII 

orf 65ng-l GQAVRKKALTEEREQTVREKAQKKDAETVKKQAVKPSKETEKKASKEEKKAAKEKVAPKP 

130 140 150 160 170 180 



40 



190 200 210 220 230 239 

orf 65-1 .pep TPEQILNSGSIEKARSAAAKEVQKMKTSDKAEATHYL-QMGAYADRQSAEGQRAKLAILG 

MINIM ! I I I I M I I I M I I I I h :: : : :. : : : MIIIIIIIIMII 
orf 65ng-l TPEQILNSRS I EKARSAAAKEVQ KM KNFGQGGSQRIICKWARMPTVRSAEGQRAKLAILG 

190 200 210 220 230 240 



45 



240 250 260 270 280 290 

orf 65-1 .pep I S S KWGYQAGHKTLYRVQSGNMS ADAVKKMQDELKKHEVAS L I RS I ES KX 

Mhl 1 1 1 1 1 Ml 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 ' M M ! 1 1 IMMMIM 

orf65ng-l I SSEWGYQAGHKTLYRVQSGNMS ADAVKKMQDELKKHGVASL I RA I EGKX 

250 260 270 280 290 
On this basis, including the presence of a putative transmembrane domain in the gonococcal
protein, it is predicted that the proteins from N. meningitidis and N. gonorrhoeae, and their epitopes,
could be useful antigens for vaccines or diagnostics, or for raising antibodies.
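
The percent-identity figures quoted in the homology comparisons above are derived from pairwise alignments. As a rough illustration only, the sketch below counts identical positions between two rows that are already aligned (equal length, '-' for gaps). The denominator convention (every column in which at least one row has a residue) is an assumption; alignment programs differ on this point, so the result need not reproduce the percentages reported here. The function name percent_identity is hypothetical.

```python
def percent_identity(aligned_a, aligned_b, gap="-"):
    """Percent of compared alignment columns holding identical residues.

    Both inputs must already be aligned and of equal length.
    Columns where both rows have a gap are skipped; a residue
    opposite a gap counts as a mismatch (an assumed convention).
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned rows must have the same length")
    matches = 0
    compared = 0
    for x, y in zip(aligned_a.upper(), aligned_b.upper()):
        if x == gap and y == gap:
            continue
        compared += 1
        if x == y:
            matches += 1
    if compared == 0:
        raise ValueError("no comparable columns")
    return 100.0 * matches / compared


# One 60-residue row pair from the ORF65-1 / ORF65ng-1 comparison above
# (positions 61-120 of each sequence):
row_orf65_1 = "NQPKEDIQPEPADQNALSEPDAATEAEQSDAEKAADKQPVADKADEVEEKAGEPEREEPD"
row_orf65ng_1 = "NQPKEDIQPEPADQNALSEPDVAKEAEQSDAEKAADKQPVADKADEVEEKAGEPEREEPD"
print(round(percent_identity(row_orf65_1, row_orf65ng_1), 1))
```

Note that an unread residue written as X in both rows would be counted as a match by this sketch, which may overstate identity for the partial sequences.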

Example 46 

The following DNA sequence, believed to be complete, was identified in N. meningitidis [<SEQ ID
391>] (SEQ ID NO: 391):

1 ATGAACCACG ACATCACTTT CCTCACCCTG TTCCTACTCG GTkTCTTCGG 

51 CGGAAcGCAC TGCATCGGTA TGTGCGGCGG ATTAAGCAGC GcGTTTGs . s 

101 TCCAACTCCC CCCGCATATC AACCGCTTTT GGCTGATCCT GCTGCTTAAC 

151 ACAGGACGGG TAAGCAGCTA TACGGCAAtC GGCCTGATAC TCGGATTAAT 

201 CGGACAGGTC GGCGTTTCAC TCGAcCAaAC CCGCGTCCTG CAGAATATTT 

251 TATACACGGC CGCCAACCTC CTGCTGCTCT TTTTAGGCTT ATACTTGAGC 

301 GGTATTTCTT CCTTGGCGGC AAAAATCGAG AAaATCGGCA AACCGATATG 

3 51 GCGGAACCTG AACCCGATAC TCAACCGGCT GTTACCCATA AAATCCATAC 

4 01 CCGCCTGCCT tGCGgTCGGA ATATTATGGG GCTGGCTGCC GTGCGGACTG 
4 51 GTTTACAGCG CGTCGCTTTA CGCGCTGGGA AgCGGTAGTG CGGCAACGGG 
501 CGGGTTATAT ATGCTTGCCT TTGCACTGGG TACGCTGCCC AATCTTtTAG 
551 CAATCGGCAT TTTtTCCCTG CAACTGAAwA AAATCATGCA AAACCGATAT 
601 ATCCGCCTGT GTACGGGATT ATCCGTATCA TTATGGGCAT TATGGAAACT 
651 TGCCGTCCTG TGGCTGTAA 

This corresponds to the amino acid sequence [<SEQ ID 392; ORF103>] (SEQ ID NO: 392;
ORF103):



1 MNHDITFLTL FLLGXFGGTH CIGMCGGLSS AFXXQLPPHI NRFWLILLLN 

51 TGRVSSYTAI GLILGLIGQV GVS LDQTRVL QNILYTAANL LLLFLGLYLS 

101 GISSLAAKIE KIGKPIWRNL NPILNRLLPI KSIPACLAVG ILWGWLPCGL 

151 VYSASLYALG SGSAATGGLY MLAFALGTLP NLLAIGIFSL QLXKIMQNRY 

201 IRLCTGLSVS LWALWKLAVL WL* 

Further work elaborated the DNA sequence [<SEQ ID 393>] (SEQ ID NO: 393) as:



1 ATGAACCACG ACATCACTTT CCTCACCCTG TTCCTACTCG GTTTCTTCGG 

51 CGGAACGCAC TGCATCGGTA TGTGCGGCGG ATTAAGCAGC GCGTTTGCGC 

101 TCCAACTCCC CCCGCATATC AACCGCTTTT GGCTGATCCT GCTGCTTAAC 

151 ACAGGACGGG TAAGCAGCTA TACGGCAATC GGCCTGATAC TCGGATTAAT 

201 CGGACAGGTC GGCGTTTCAC TCGACCAAAC CCGCGTCCTG CAGAATATTT 

251 TATACACGGC CGCCAACCTC CTGCTGCTCT TTTTAGGCTT ATACTTGAGC 

301 GGTATTTCTT CCTTGGCGGC AAAAATCGAG AAAATCGGCA AACCGATATG 

351 GCGGAACCTG AACCCGATAC TCAACCGGCT GTTACCCATA AAATCCATAC 

4 01 CCGCCTGCCT TGCGGTCGGA ATATTATGGG GCTGGCTGCC GTGCGGACTG 

4 51 GTTTACAGCG CGTCGCTTTA CGCGCTGGGA AGCGGTAGTG CGGCAACGGG 

501 CGGGTTATAT ATGCTTGCCT TTGCACTGGG TACGCTGCCC AATCTTTTAG 

551 CAATCGGCAT TTTTTCCCTG CAACTGAAAA AAATCATGCA AAACCGATAT 

601 ATCCGCCTGT GTACGGGATT ATCCGTATCA TTATGGGCAT TATGGAAACT 

651 TGCCGTCCTG TGGCTGTAA 
This corresponds to the amino acid sequence [<SEQ ID 394; ORF103-1>] (SEQ ID NO: 394;
ORF103-1):



1 MNHDITFLTL FLLGFFGGTH CIGMCGGLSS AFALQLPPHI NRFWLILLLN 

51 TGRVSSY TAI GLILGLIGQV GVSL DQTRVL QNILYTAAN L LLLFLGLYLS 

101 GISSLAAKIE KIGKPIWRNL NPILNRLLPI KSIP ACLAVG ILWGWLPCGL 

151 VYSASLYALG SGSAATGGLY M LAFALGTLP NLLAIGIF SL QLKKIMQNRY 

201 IRLCTGLSVS LWALWKLAVL WL* 
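
The listings in this document print each sequence in blocks of ten with a leading position number, and the scanned text sometimes splits that number (for example "4 01"). A small helper can recover the contiguous sequence for further analysis. This is an illustrative sketch; join_listing is a hypothetical name, and the approach simply discards every character that is not a letter or the '*' stop symbol, so gap dots in the partial sequences are lost.

```python
import re


def join_listing(lines):
    """Concatenate numbered, space-separated listing lines.

    Position numbers, spaces and punctuation are dropped; letters
    (including lowercase or ambiguous bases) and '*' are kept.
    """
    return "".join(re.sub(r"[^A-Za-z*]", "", line) for line in lines)


# Two lines in the style of the nucleotide listing above:
dna_lines = [
    "1 ATGAACCACG ACATCACTTT CCTCACCCTG TTCCTACTCG GTTTCTTCGG",
    "4 01 CCGCCTGCCT TGCGGTCGGA ATATTATGGG GCTGGCTGCC GTGCGGACTG",
]
print(len(join_listing(dna_lines)))
```

Because the helper cannot tell a stray page line number from a position number, both are removed, which is the desired behaviour for these listings.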

Computer analysis of this amino acid sequence gave the following results: 



Homology with a predicted ORF from N.meningitidis (strain A) 



ORF103 (SEQ ID NO: 392) shows 93.8% identity over a 222aa overlap with an ORF (ORF103a)
(SEQ ID NO: 396) from strain A of N. meningitidis:



10 20 30 40 50 60 

orf103.pep MNHDITFLTLFLLGXFGGTHCIGMCGGLSSAFXXQLPPHINRFWLILLLNTGRVSSYTAI

II MINIUM MIIIMMMIIMM MINIM IMIIMIMMMIM

orf103a MNXDITFLTLFLLGFFGGTHCIGMCGGLSSAFALQLPPHINRXWLILLLNTGRVSSYTAI
10 20 30 40 50 60

70 80 90 100 110 120

orf103.pep GLILGLIGQVGVSLDQTRVLQNILYTAANLLLLFLGLYLSGISSLAAKIEKIGKPIWRNL

MMIIMMMMIIMI MMMMMIMIIIIMMMMIMIIMMMIMI

orf103a GLILGLIGQVGVSLDQTRVXQNILYTAANLLLLFLGLYLSGISSLAAKIEKIGKPIWRNL
70 80 90 100 110 120



130 140 150 160 170 180 

orf 103 .pep NP I LNRLLP I KS I P ACLAVG I LWGWLPCGLVYS AS LYALGSGS AATGGLYMLAFALGTLP 

MMMMMIMIMM I I I Mill IMIIIIIIMMMIIMMMIIIM 

orf 103a NP I LNRLLP I KS I PACLAVGI LWGWLPCGLVYSASLYALGSGSAATGGLYMLAFALGTLP 

130 140 150 160 170 180 



190 200 210 220 

orf 103 .pep NLLAIGIFSLQLXKIMQNRYIRLCTGLSVSLWALWKLAVLWLX 

II I I IMIMI I MIMM Ml III IMIIMI 

orf 103a NLXAI G I FSLQLXKIMQNRY I RLCTGLS VS LWALWKLAVLWLX 

190 200 210 220 



The complete length ORF103a nucleotide sequence [<SEQ ID 395>] (SEQ ID NO: 395) is:



1 ATGAACCANG ACATCACTTT CCTCACCCTG TTCCTACTCG GTTTCTTCGG 

51 CGGAACGCAC TGCATCGGTA TGTGCGGCGG ATTAAGCAGC GCGTTTGCGC 

101 TCCAACTCCC CCCGCATATC AACCGCTTNT GGCTGATCCT GCTGCTTAAC 

151 ACAGGACGGG TAAGCAGCTA TACGGCAATC GGCCTGATAC TCGGATTAAT 

201 CGGACAGGTC GGCGTTTCAC TCGACCAAAC CCGCGTCNTG CAGAATATTT 

251 TATACACGGC CGCCAACCTC CTGCTGCTCT TTTTAGGCTT ATACTTGAGC 

3 01 GGTATTTCTT CCTTGGCGGC AAAAATCGAG AAAATCGGCA AACCGATATG 
351 GCGGAACCTG AACCCGATAC TCAACCGGCT GTTACCCATA AAATCCATAC 

4 01 CCGCCTGCCT TGCGGTCGGA ATATTATGGG GCTGGCTGCC GTGCGGACTA 
451 GTTTACAGCG CGTCGCTTTA CGCGCTGGGA AGCGGTAGTG CGGCAACGGG 
501 CGGGTTATAT ATGCTTGCCT TTGCACTGGG TACGCTGCCC AATCTTTNGG
551 CAATCGGCAT TTTTTCCCTG CAACTGNAAA AAATCATGCA AAACCGATAT 
601 ATCCGCCTGT GTACGGGATT ATCCGTATCA TTATGGGCAT TATGGAAACT 
651 TGCCGTCCTG TGGCTGTAA 

5 

This encodes a protein having amino acid sequence [<SEQ ID 396>] (SEQ ID NO: 396):

1 MNXDITFLTL FLLGFFGGTH CIGMCGGLSS AFALQLPPHI NRXWLILLLN 

51 TGRVSSY TAI GLILGLIGQV GVSL DQTRVX QNILYTAAN L LLLFLGLYLS 

101 GISSLAAKIE KIGKPIWRNL NPILNRLLPI KSIP ACLAVG ILWGWLPCGL 

10 151 VYSASLYALG SGSAATGGLY M LAFALGTLP NLXAIGIF SL QLXKIMQNRY 

2 01 IRLCTGLSVS LWALWKLAVL WL* 

ORF103a (SEQ ID NO: 396) and ORF103-1 (SEQ ID NO: 394) show 97.7% identity in 222 aa
overlap:

15 10 20 30 40 50 60 

orf 103a . pep MNXDITFLTLFLLGFFGGTHCIGMCGGLSSAFALQLPPHINRXWLILLLNTGRVSSYTAI 

II M 1 1 1 1 II 1 1 1 1 1 1 1 1 M 1 1 1 ! 1 1 1 M I II M 1 1 1 1 1 1 1 1 1 1 1 M 1 1 1 1 1 1 1 

orf 103-1 MNHDITFLTLFLLGFFGGTHCIGMCGGLSSAFALQLPPHINRFWLILLLNTGRVSSYTAI 

10 20 30 40 50 60 

20 70 80 90 100 110 120 

orf 103a. pep GL I LGL IGQVGVS LDQTRVXQN I LYTAANLLLLFLGL YLSG I S SLAAKI EKI GKP I WRNL 

II MM lllllll MINI IMIIII Mil Mil IIMIII IIIMIIIIMIIM II 

orfl03-l GL I LGL IGQVGVS LDQTRVLQN I LYTAANLLLLFLGL YLSG I S S LAAK I EKI GKP I WRNL 

70 80 90 100 110 120 

25 130 140 , 150 160 170 180 

orf 103a . pep NP I LNRLLP I KSIPACLAVG I LWGWLPCGLVYSASLYALGSGSAATGGLYMLAFALGTLP 

I I I I I i I I I I I ] I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I II I I I I I 
orf 103-1 NP I LNRLLP IKS I PACLAVGILWGWLPCGLVYSASLYALGSGSAATGGLYMLAFALGTLP 

130 140 150 160 170 180 

30 190 200 210 220 

orf 103a. pep NLXAIGIFSLQLXKIMQNRYIRLCTGLSVSLWALWKLAVLWLX 

II MINIMI I II 1 1 II II I M II 1 1 II 1 1 1 1 1 M M II 

orf 103 - 1 NLLAIGIFSLQLKKIMQNRYIRLCTGLSVSLWALWKLAVLWLX 

190 200 210 220 

Homology with a predicted ORF from N. gonorrhoeae

ORF103 (SEQ ID NO: 392) shows 95.5% identity over a 222aa overlap with a predicted ORF
(ORF103.ng) (SEQ ID NO: 398) from N. gonorrhoeae:

orf 103 .pep MNHDITFLTLFLLGXFGGTHCIGMCGGLSSAFXXQLPPHINRFWLILLLNTGRVSSYTAI 60 

MMMMIMM llllllllll llllll MM MIMMMMIMMIIMI 

40 orf 103ng MNHDITFLTLFLLGFFGGTHCIGMCGGLSSAFALQLPPHINRFWLILLLNTGRISSYTAI 60 

orf 103 .pep GLI LGL IGQVGVSLDQTRVLQNI LYTAANLLLLFLGLYLSGI SSLAAKIEKIGKP I WRNL 120 

M 1 1 1 II MMM I II M 1 1 1 1 1 M Ml Ml 1 1 1 1 1 1 M 1 1 II 1 1 1 1 1 M M 1 1 1 1 M 

orf 103ng GLMLGLIGQLG I S LDQTRVLQN I LYTASNLLLLFLGL YLSG I SSLAAKIEKIGKP I WRNL 120 
orf 103 .pep NPILNRLLPIKSIPACLAVGILWGWLPCGLVYSASLYALGSGSAATGGLYMLAFALGTLP 180

1 1 1 1 1 1 Ml II M 1 1 1 1 1 1 1 1 1 1 1 1 1 1 II 1 1 1 ! 1 1 1 1 1 II 1 1 M 1 1 1 1 1 1 II 1 1 1 i 1 1 

or f 1 0 3 ng NPI LNRLLP IKS I PACLAVG I LWGWLPCGLVYS AS LYALGSGS ATTGGL YMLAFALGTLP 180 

orf 103 .pep NLLAI GI FSLQLXKI MQNRY I RLCTGLS VSLWALWKLAVLWL 222 

Illlllllllll IIIIIIIIIIIIIIIMIIIIIMIIMI 

or f 1 0 3 ng NLLAIG I FSLQLKKIMQNRY I RLCTGLS VSLWALWKLAVLWL 222 

The complete length ORF103ng nucleotide sequence [<SEQ ID 397>] (SEQ ID NO: 397) is:



1 ATGAACCACG ACATCACTTT CCTCACCCTG TTCCTGCTCG GTTTCTTCGG 

51 CGGAACTCAC TGCATCGGTA TGTGCGGCGG ATTAAGCAGC GCGTTTGCGC 

101 TCCAACTCCC CCCGCATATC AACCGCTTTT GGCTGATTCT GCTGCTTAAC 

151 ACAGGACGGA TAAGCAGCTA TACGGCAATC GGCCTGATGC TCGGATTAAT 

2 01 CGGACAACTC GGCATTTCAC TCGACCAAAc ccgcgTCCTG CAAAATATTT 
251 tatacacagc ctccaaCCTC CTGCTGCTCT TTTTAGGCTT ATACTTGAGC 
301 GGTATTTCTT CCTTGGCGGC AAAAATCGAG AAAATCGGCA AACCGATATG 

3 51 GCGCAACCTG AACCCGATAC TCAACCGGCT GCTGCCCATA AAATCCATAC 
401 CCGCCTGCCT TGCTGTCGGA ATATTATGGG GCTGGCTGCC GTGCGGACTG 

4 51 GTTTACAGCG CATCACTTTA CGCGCTGGGA AGCGGTAGTG CGACAACCGG 
501 CGGACTGTAT ATGCTTGCCT TTGCACTGGG TACGCTGCCC AATCTTTTGG 
551 CAATCGGCAT TTTTTCCCTG CAACTGAAAA AAATCATGCA AAACCGATAT 
601 ATCCGCCTGT GTACAGGATT ATCCGTATCA TTATGGGCAT TATGGAAGCT 
651 TGCCGTCCTG TGGCTGTAA 

This encodes a protein having amino acid sequence [<SEQ ID 398>] (SEQ ID NO: 398):



1 MNHDITFLTL FLLGFFGGTH CIGMCGGLSS AFALQLPPHI NRFWLILLLN 

51 TGRISSY TAI GLMLGLIGQL GISL DQTRVL QNILYTASN L LLLFLGLYLS 

101 GISSLA AKIE KIGKPIWRNL NPI LNRLLP I KS I P ACLAVG ILWGWLPCGL 

151 VYSASLYALG SGS ATTGGL Y M LAFALGTLP NLLAIGIF SL QLKKIMQNRY 

2 01 I RLCTGLS VS LWALWKLAVL WL* 

In addition, ORF103ng (SEQ ID NO: 398) and ORF103-1 (SEQ ID NO: 394) show 97.3% identity
in 222 aa overlap:



10 20 30 40 50 60 

orf 103 - 1 . pep MNHDITFLTLFLLGFFGGTHCIGMCGGLSSAFALQLPPHINRFWLILLLNTGRVSSYTAI 

IMIIIIIMI IMIIIIIIIIIIIIIIIII I IIIIMIIIIIIIIIIIhlll-l 
orf 103ng MNHDITFLTLFLLGFFGGTHCIGMCGGLSSAFALQLPPHINRFWLILLLNTGRISSYTAI 

10 20 30 40 50 60 



70 80 90 100 110 120 

orf 103-1 .pep GLILGLIGQVGVSLDQTRVLQNILYTAANLLLLFLGLYLSGISSLAAKIEKIGKPIWRNL 

|: I I I I I I : I :| I I I I I I I I I I I I I I : I I I I I I. M II I I I M I I I I M I I I I I I I I 
orf 103ng GLMLGLIGQLGISLDQTRVLQNILYTASNLLLLFLGLYLSGISSLAAKIEKIGKPIWRNL 

70 80 90 100 110 120 



130 140 150 160 170 180 

orf 103 - 1 . pep NP I LNRLLP I KS I PACLAVG I LWGWLPCGLVYS AS LYALGSGS AATGGL YMLAFALGTLP 

1 1 1 II 1 1 1 ,1 1 1 1 1 1 1 1 1 i 1 1 1 1 M 1 1 1 ! I I II 1 1 II 1 1 1 h 1 1 1 1 1 1 1 1 1 1 : h I 

orfl03ng NPI LNRLLP I KS I PACLAVG I LWGWLPCGLVYS AS LYALGSGS ATTGGL YMLAFALGTLP 

130 140 150 160 170 180 



190 200 210 220

orf103-1.pep NLLAIGIFSLQLKKIMQNRYIRLCTGLSVSLWALWKLAVLWLX

orf103ng

190 200 210 220


Based on this analysis, including the presence of a putative leader sequence (double-underlined) 
and several putative transmembrane domains (single-underlined) in the gonococcal protein, it is 
predicted that the proteins from N. meningitidis and N. gonorrhoeae, and their epitopes, could be 
useful antigens for vaccines or diagnostics, or for raising antibodies. 
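
The specification does not state how the putative transmembrane domains were predicted. One widely used heuristic, shown here purely as an illustration and not as the method behind the statement above, is a sliding-window average of Kyte-Doolittle hydropathy values. The window length of 19 and the threshold of 1.6 are conventional choices adopted as assumptions, and the function names are hypothetical.

```python
# Kyte-Doolittle hydropathy index for the 20 standard amino acids.
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def hydropathy_profile(protein, window=19):
    """Mean hydropathy of every full window along the sequence.

    Unknown residues (for example 'X' or '*') score 0.0, an assumption
    made so that partial sequences can still be scanned.
    """
    scores = [KYTE_DOOLITTLE.get(res, 0.0) for res in protein.upper()]
    return [
        sum(scores[i:i + window]) / window
        for i in range(len(scores) - window + 1)
    ]


def candidate_tm_windows(protein, window=19, threshold=1.6):
    """0-based start indices of windows whose mean meets the threshold."""
    profile = hydropathy_profile(protein, window)
    return [i for i, mean in enumerate(profile) if mean >= threshold]


# First 50 residues of ORF103-1 as listed above:
print(candidate_tm_windows(
    "MNHDITFLTLFLLGFFGGTHCIGMCGGLSSAFALQLPPHINRFWLILLLN"))
```

Windows flagged this way are only candidates; dedicated topology predictors apply further rules before calling a segment membrane-spanning.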



The following partial DNA sequence was identified in N. meningitidis [<SEQ ID 399>] (SEQ ID NO: 399):



1 ATGGAAAACC AAAGGCCGCT CCTAGGCTTT CGCTTGGCAC TTTTGGCGGC 

51 GATGACGTGG GGAACGCTGC . CGAT . TCCGT GCGGCAGGTA TTGAAGTTTG 

101 TCGATGCGCC GACGCTGGTG TGGGTGCGTT TTACCGTGGC GGCGGCGGTA 

151 TTGTTTGTTT TGCTGGCACT GGGCGGGCGG CTGCcGAAGC GGCG^GGATT 

2 01 TTTCTTGGTG CTCATTCAGG CTGCTGCTGC TCGGCGTGGC GGGCATTTCG 
251 GCAAACTTTG TGCTGATTGC CCAAGGGCTG CATTATATTT CGCCGACCAC 

3 01 GACGCAGGTT TTGTGGCAGA TTTCGCCGTT TACGATGATT GTwGTCGGTG 
3 51 TGTTGGTGTT TAAAGACCGG ATGACTGCCG CTCAGAAAAT CGGCTTGGTT 
401 TTGCTGCTTG CCGGTTTGCT TATGTATTTT AACGATAAAT TCGGCGAGTT 
451 GTCGGGTTTG GGCGCGTATG C.AAGGGCGT GTTGCTGTGT GCGGCAGGCA 
501 GTATGGCATG GGTGTGTAAT GCCGTGGCGC AAAAGCTGCT GTCGGCGCAA 
551 TTCGGGCCGC AACAGATTCT GCTGTTGATT TATGCGGCAA GTGCCGCCGT 
601 GTTCCTGCCG TTTGCCGAAC CGGCACACAT CGGAAGTATG GACGGTACGT 
651 TGGCGTGGGT ATGTATTGCG TATTGCTGCT TGAATACGTT AATCGGTTAC 
701 GGCTCGTTCG GCGAGGCGTT GAAACATTGG GAGGCTTCCA AAGTCAGCGC 
751 GGTAACAACC TTGCTCCCCG TGTTTACCGT AATAAATACT TTGCTCGGGC 
801 ATTATGTGAT GCCTGAAACT TTTGCCGCGC CGGA. . 



This corresponds to the amino acid sequence [<SEQ ID 400; ORF104>] (SEQ ID NO: 400; ORF104):



1 MENQRPLLGF RLALLAAMTW GTLPXSVRQV LKFVDAPTLV WVRFTVAAAV 

51 LFVLLALGGR LPKRRDFSWC SFRLLLLGVA GISANFVLIA QGLHYISPTT 

101 TQVLWQISPF TMIWGVLVF KDRMTAAQKI GLVLLLAGLL MYFNDKFGEL 

151 SGLGAYXKGV LLCAAGSMAW VCNAVAQKLL SAQFGPQQIL LLIYAASAAV 

2 01 FLPFAEPAHI GSMDGTLAWV CIAYCCLNTL IGYGSFGEAL KHWEASKVSA 

2 51 VTTLLPVFTV INTLLGHYVM PETFAAP . . . 



Further work revealed a further partial DNA sequence [<SEQ ID 401>] (SEQ ID NO: 401):



1 ATGGAAAACC AAAGGCCGCT CCTAGGCTTC GCGTTGGCAC TTTTGGCGGC 

51 GATGACGTGG GGAACGCTGC CGATTGCCGT GCGGCAGGTA TTGAAGTTTG 

101 TCGATGCGCC GACGCTGGTG TGGGTGCGTT TTACCGTGGC GGCGGCGGTA 

151 TTGTTTGTTT TGCTGGCACT GGGCGGGCGG CTGCCGAAGC GGCGGGATTT 

201 TTCTTGGTGC TCATTCAGGC TGCTGCTGCT CGGCGTGGCG GGCATTTCGG 

251 CAAACTTTGT GCTGATTGCC CAAGGGCTGC ATTATATTTC GCCGACCACG

301 ACGCAGGTTT TGTGGCAGAT TTCGCCGTTT ACGATGATTG TTGTCGGTGT
351 GTTGGTGTTT AAAGACCGGA TGACTGCCGC TCAGAAAATC GGCTTGGTTT

401 TGCTGCTTGC CGGTTTGCTT ATGTTTTTTA ACGATAAATT CGGCGAGTTG
451 TCGGGTTTGG GCGCGTATGC GAAGGGCGTG TTGCTGTGTG CGGCAGGCAG

501 TATGGCATGG GTGTGTTATG CCGTGGCGCA AAAGCTGCTG TCGGCGCAAT

551 TCGGGCCGCA ACAGATTCTG CTGTTGATTT ATGCGGCAAG TGCCGCCGTG 

601 TTCCTGCCGT TTGCCGAACC GGCACACATC GGAAGTTTGG ACGGTACGTT 

651 GGCGTGGGTT TGTTTTGCGT ATTGCTGCTT GAATACGTTA ATCGGTTACG 

701 GCTCGTTCGG CGAGGCGTTG AAACATTGGG AGGCTTCCAA AGTCAGCGCG 

751 GTAACAACCT TGCTCCCCGT GTTTACCGTA ATAwTwwCTT TGCTCGGGCA

801 TTATGTGATG CCTGAAACTT TTGCCGCGCC GGA. . . 

This corresponds to the amino acid sequence [<SEQ ID 402; ORF104-1>] (SEQ ID NO: 402;
ORF104-1):

1 MENQRPLLGF ALALLAAMTW GTLPIAVRQV LKFVDAPTLV WVRFTVAAAV

51 LFVLLALGGR LPKRRDFSWC SFRLLLLGVA GISANFVLIA QGLHYISPTT

101 TQVLWQISPF TMIVVGVLVF KDRMTAAQKI GLVLLLAGLL MFFNDKFGEL

151 SGLGAYAKGV LLCAAGSMAW VCYAVAQKLL SAQFGPQQIL LLIYAASAAV

201 FLPFAEPAHI GSLDGTLAWV CFAYCCLNTL IGYGSFGEAL KHWEASKVSA

251 VTTLLPVFTV IXXLLGHYVM PETFAAP...

Computer analysis of this amino acid sequence gave the following results: 

Homology with hypothetical HI0878 protein (SEQ ID NO: 1138) of H. influenzae (accession
number U32769)

ORF104 (SEQ ID NO: 400) and HI0878 (SEQ ID NO: 1138) show 40% aa identity in 277aa
overlap:






orf104   4 QRPLLGFRLALLAAMTWGTLPXSVRQVLKFVDAPTLVWXXXXXXXXXXXXXXXXXXXXP- 62
           Q+PLLGF  AL+ AM WG+LP +++QVL  ++A T+VW                    P
HI0878   3 QQPLLGFTFALITAMAWGSLPIALKQVLSVMNAQTIVWYRFIIAAVSLLALLAYKKQLPE 62

orf104  63 --KRRDFSWCSFRLLLLGVAGISANFVLIAQGLHYISPTTTQVLWQISPFTMIVVGVLVF 120
             K R ++W    ++L+GV G+++NF+L +  L+YI P+  Q+   +S F M++ GVL+F
HI0878  63 LMKVRQYAW----IMLIGVIGLTSNFLLFSSSLNYIEPSVAQIFIHLSSFGMLICGVLIF 118

orf104 121 KDRMTAAQKIXXXXXXXXXXMYFNDKFGELSGLGAYXKGVLLCAAGSMAWVCNAVAQKLL 180
           K+++   QKI          ++FND+F   +GL  Y  GV+L   G++ WV   +AQKL+
HI0878 119 KEKLGLHQKIGLFLLLIGLGLFFNDRFDAFAGLNQYSTGVILGVGGALIWVAYGMAQKLM 178

orf104 181 SAQFGPQQILLLIYAASAAVFLPFAEPAHIGSMDGTLAWVCIAYCCLNTLIGYGSFGEAL 240
             +F  QQILL++Y   A  F+P A+ + +  +   LA +C  YCCLNTLIGYGS+ EAL
HI0878 179 LRKFNSQQILLMMYLGCAIAFMPMADFSQVQELT-PLALICFIYCCLNTLIGYGSYAEAL 237

orf104 241 KHWEASKVSAVTTLLPVFTVINTLLGHYVMPETFAAP 277
             W+ SKVS V TL+P+FT++ + + HY  P  FAAP
HI0878 238 NRWDVSKVSVVITLVPLFTILFSHIAHYFSPADFAAP 274



Homology with a predicted ORF from N. meningitidis (strain A)

ORF104 (SEQ ID NO: 400) shows 95.3% identity over a 277aa overlap with an ORF (ORF104a)
(SEQ ID NO: 404) from strain A of N. meningitidis:

10 20 30 40 50 60 

orf 104 . pep MENQRPLLGFRLALLAAMTWGTLPXSTOQVLKFVDAPTLVWVRFTVAAAVLFVLLALGGR 

5 llllllllll lllllllllllll :||||llllllllll!lllllllllllllllllll 

orf 104 a MENQRPLLGFALALLAAMTWGTLP I ATOQVLKFVDAPTLVWVRFTVAAAVLFVLLALGGR 

10 20 30 40 50 60 

70 80 90 100 110 120 

orf 104 .pep LPKRRDFSWCS FRLLLLGVAGI S ANFVLI AQGLHY I SPTTTQVLWQ I S PFTMI WGVLVF 

10 Ml || MM I llllllllll I 1 1 1 1 1 I - 1 1 1 ! I t < I - 1 III ; I II llllllllll 

orf 104a LPKWRDFSWCS FRLLLLGVAGI SAN FVLIAQGLHYISPTTTQVLWQISPFTMI WGVLVF 

70 80 90 100 110 120 

130 140 150 160 170 180 

or f 1 04 . pep KDRMTAAQKIGLVLLLAGLLMYFNDKFGELSGLGAYXKGVLLCAAGSMAWVCNAVAQKLL 

15 || | | M I II I I I I I II I I I I I : I I I I I II II I I I I I I I I I I I I I I I I I I I I M M I I I 

or f 1 04a KDRMTAAQKIGLVLLLAGLLMFFNDKFGELSGLGAYAKGVLLCAAGSIVIAWVCYAVAQKLL 

130 140 150 160 170 180 

190 200 210 220 230 240 

orf 104 . pep SAQFGPQQILLLI YAASAAVFLPFAEPAHIGSMDGTLAWVCIAYCCLNTLIGYGSFGEAL 

20 MM II II MIMI II II II Mill! I M I h II 1 1 1 1 1 h 1 1 1 1 1 lllllllllllll 

orf 104a S AQFGPQQ I LLL I YAAS AAVFLPFAELAH IGS LDGTLA WCFAYCCLNTL I GYGS FGEAL 

190 200 210 220 230 240 

250 260 270 

orf 104 .pep KHWEAS KVSAVTTLLPVFTVINTLLGHYVMPETFAAP 

25 II 1 1 II 1 1 II II 1 1 INI II I =1111111 hi I III 

or f 1 04a KHWEAS KVSAVTTLLPVFTV I FSLLGHYVMPDTFAAPDMNGLGYAGALVWGGAVTAAVG 

250 260 270 280 290 300 

The complete length ORF104a nucleotide sequence [<SEQ ID 403>] (SEQ ID NO: 403) is:

1 ATGGAAAACC AAAGGCCGCT CCTAGGCTTC GCGTTGGCAC TTTTGGCGGC

51 GATGACGTGG GGAACGCTGC CGATTGCCGT GCGGCAGGTA TTGAAGTTTG 

101 TCGATGCGCC GACGCTGGTG TGGGTGCGTT TTACCGTGGC GGCGGCGGTA 

151 TTGTTTGTTT TGCTGGCATT GGGCGGGCGG CTGCCGAAGT GGCGGGATTT 

201 TTCTTGGTGC TCATTCAGGC TGCTGCTGCT CGGCGTGGCG GGCATTTCGG
251 CAAACTTTGT GCTGATTGCC CAAGGGCTGC ATTATATTTG GCCGACCACG

301 ACGCAGGTTT TGTGGCAGAT TTCGCCGTTT ACGATGATTG TTGTCGGTGT
351 GTTGGTGTTT AAAGACCGGA TGACTGCCGC TCAGAAAATC GGCTTGGTTT

401 TGCTGCTTGC CGGTTTGCTT ATGTTTTTTA ACGATAAATT CGGCGAGTTG
451 TCGGGTTTGG GCGCGTATGC GAAGGGCGTG TTGCTGTGTG CGGCAGGCAG

501 TATGGCATGG GTGTGTTATG CCGTGGCGCA AAAGCTGCTG TCGGCGCAAT

551 TCGGGCCGCA ACAGATTCTG CTGTTGATTT ATGCGGCAAG TGCCGCCGTG

601 TTCCTGCCGT TTGCCGAACT GGCACACATC GGAAGTTTGG ACGGTACGTT

651 GGCGTGGGTT TGTTTTGCGT ATTGCTGCTT GAATACGTTA ATCGGTTACG

701 GCTCGTTCGG CGAGGCGTTG AAACATTGGG AGGCTTCCAA AGTCAGCGCG

751 GTAACAACCT TGCTCCCCGT GTTTACCGTA ATATTTTCTT TGCTCGGGCA

801 TTATGTGATG CCTGATACTT TTGCCGCGCC GGATATGAAC GGTTTGGGTT 

851 ATGCCGGCGC ACTGGTCGTG GTCGGGGGTG CGGTTACGGC GGCGGTGGGG 

901 GACAGGCTGT TCAAACGCCG CTAG 



This encodes a protein having amino acid sequence [<SEQ ID 404>] (SEQ ID NO: 404):

1 MENQRPLLGF ALALLAAMTW GTLPIAVRQV LKFVDAPTLV WVRFTVAAAV
51 LFVLLALGGR LPKWRDFSWC SFRLLLLGVA GISANFVLIA QGLHYISPTT
101 TQVLWQISPF TMIVVGVLVF KDRMTAAQKI GLVLLLAGLL MFFNDKFGEL
151 SGLGAYAKGV LLCAAGSMAW VCYAVAQKLL SAQFGPQQIL LLIYAASAAV
201 FLPFAELAHI GSLDGTLAWV CFAYCCLNTL IGYGSFGEAL KHWEASKVSA

251 VTTLLPVFTV IFSLLGHYVM PDTFAAPDMN GLGYAGALVV VGGAVTAAVG
301 DRLFKRR*

ORF104a (SEQ ID NO: 404) and ORF104-1 (SEQ ID NO: 402) show 98.2% identity in 277 aa
overlap:

10 20 30 40 50 60 

orf 104a . pep MENQRPLLGFALALLAAMTWGTLPIAWQVLKFVDAPTLVWVRFTVAAAVLFVLLALGGR 

IIIMI llllllirilllllll illlllll IIIIIIIIMIIIIIMIIIIIII 

orf 104 - 1 MENQRPLLGFALALLAAMTWGTLPIAVRQVLKFVDAPTLVWVRFTVAAAVLFVLLALGGR 
15 10 20 30 40 50 60 

70 80 90 100 110 120 

orf 104a. pep LPKWRDFS WCS FRLLLLGVAG I S ANFVL I AQGLHY I S PTTTQVLWQ ISP FTM I WGVLVF 

III I MINI MM I MM I MINI MINIMI II INI II MINIMI Mill II 

orf 104 - 1 LPKRRDFS WCS FRLLLLGVAG I S ANFVL I AQGLHY I S PTTTQVLWQ ISP FTM I WGVLVF 

20 70 80 90 100 110 120 

130 140 150 160 170 180 

orf 104a .pep KDRMTAAQKIGLVLLLAGLLMFFNDKFGELSGLGAYAKGVLLCAAGSMAWVCYAVAQKLL 

I I I I II I I I I M I I I I II I I I I I : I I I i I I I I I M I I I I I I I I II I I I I I I I I I I I I I 

orf 104 - 1 KDRMTAAQKIGLVLLLAGLLMFFNDKFGELSGLGAYAKGVLLCAAGSMAWVCYAVAQKLL 
25 130 140 150 160 170 ' 180 

190 200 210 220 230 240 

orf 104a. pep SAQFGPQQ I LLL I YAASAAVFLPFAELAH I GSLDGTLAWVCFAYCCLNTL IGYGSFGEAL 

I I I I I I I I I I I I I I I I I II I I I II I i I I I I I I I I I I I I I I I I I I I I M I I I I I I I I 
orf 104 - 1 SAQFGPQQ I LLL I YAAS AAVFLPFAEP AH I GS LDGTLAWVCFAYCCLNTL I GYGS FGEAL 

30 190 200 210 220 230 240 

250 260 270 280 290 300 

orf 104a . pep KHWEASKVSAVTTLLPVFTVIFSLLGHYVMPDTFAAPDMNGLGYAGALVWGGAVTAAVG 

I II I I I I I I I I I I I I I I I I I I I . I I I I I : I I I I 
orf 104 - 1 KHWEASKVSAVTTLLPVFTVIXXLLGHYVMPETFAAP 
35 250 260 270 

Homology with a predicted ORF from N. gonorrhoeae

ORF104 (SEQ ID NO: 400) shows 93.9% identity over a 277aa overlap with a predicted ORF
(ORF104.ng) (SEQ ID NO: 406) from N. gonorrhoeae:

orf 104 .pep MENQRPLLGFRLALLAAMTWGTLPXSVRQVLKFVDAPTLVWVRFTVAAAVLFVLLALGGR 60 

40 | | | || | M I I I I I I I I I I I II I I = I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 

orfl04ng MENQRPLLGFALALLAAMTWGTLPIAVRQVLKFVDAPTLVWVRFTVAAAVLFVLLALGGR 60 

orf 104 .pep LPKRRDFS WCS FRLLLLGVAGISANFVL I AQGLHY IS PTTTQVLWQ I SPFTM I WGVLVF 120 

lllllllll III ll.lhllllllll IIIIMIII II II II I II lllil IIIMI 

orf 104ng LPKRRDFS WHS FRLLLLGVTGISANFVL I AQGLHY IS PTTTQVLWQ I SPFTM I WGVLVF 120

orfl04.pep KDRMTAAQKIGLVLLLAGLLMYFNDKFGELSGLGAYXKGVLLCAAGSMAWVCNAVAQKLL 180

1 1 1 1 4 ] 1 1 1 1 [ 1 1 1 1 1 1 1 1 1 ^ I i 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 ! 1 1 1 1 1 1 1 lllllll 

orfl04ng KDRMTAAQKIGLVLLLVGLLMFFNDKFGELSGLGAYAKGVLLCAAGSMAWVCYAVAQKLL 180 

orf 104 .pep SAQFGPQQILLLIYAASAAVFLPFAEPAHIGSMDGTLAWVCIAYCCLNTLIGYGSFGEAL 240 

MIIIIMIIIMMII III II lllllllhlllllll|::|||IMIIIIIIIIIII 
orf 104ng SAQFGPQQILLLIYAASAAVFLLXAEPAHIGSLDGTLAWVCFVYCCLNTLIGYGSFGEAL 240 

orf 104. pep KHWEASKVSAVTTLLPVFTVINTLLGHYVMPETFAAP 277 

iiiiiiiiiiiiiiiiiiiii ni i nihil in 

orf 1 04ng KHWEASKVSAVTTLLPVFTVI FSLLGHYVMPDTFAAPDMNGLGYVGALVWGGAVTAAVG 300 

The complete length ORF104ng nucleotide sequence [<SEQ ID 405>] (SEQ ID NO: 405) is
predicted to encode a protein having amino acid sequence [<SEQ ID 406>] (SEQ ID NO: 406):

1 MENQRPLLGF ALALLAAMTW GTLPIAVRQV LKFVDAPTLV WVRFTVAAAV

51 LFVLLALGGR LPKRRDFSWH SFRLLLLGVT GISANFVLIA QGLHYISPTT

101 TQVLWQISPF TMIVVGVLVF KDRMTAAQKI GLVLLLVGLL MFFNDKFGEL

151 SGLGAYAKGV LLCAAGSMAW VCYAVAQKLL SAQFGPQQIL LLIYAASAAV

201 FLLXAEPAHI GSLDGTLAWV CFVYCCLNTL IGYGSFGEAL KHWEASKVSA

251 VTTLLPVFTV IFSLLGHYVM PDTFAAPDMN GLGYVGALVV VGGAVTAAVG

301 DRPFKRR*

Further work revealed the complete gonococcal nucleotide sequence [<SEQ ID 407>] (SEQ ID
NO: 407):



1 ATGGAAAACC AAAGGCCGCT CCTAGGCTTC GCGTTGGCAC TTTTGGCGGC 

51 GATGACGTGG GGGACGCTGC CGATTGCCGT GCGGCAGGTA TTGAAGTTTG 

101 TCGATGCGCC GACGCTGGTG TGGGTGCGTT TTACCGTGGC GGCGGCGGTA 

151 TTGTTTGTTT TGCTGGCATT GGGCGGGCGG CTGCCGAAGC GGCGGGATTT 

201 TTCTTGGCAT TCATTCAGGC TGCTGCTGCT CGGCGTGACG GGCATTTCGG

251 CAAACTTTGT GCTGATTGCC CAAGGGCTGC ATTATATTTC GCCGACCACG

301 ACGCAGGTTT TGTGGCAGAT TTCGCCGTTT ACGATGATTG TTGTCGGCGT
351 GTTGGTGTTT AAAGACCGGA tgaCTGCCGC GCAGAAAATC GGTTTGGTTT

401 TGCTGCttgT CGGTttgCTT ATGTTTTtta ACGACAAATT CGGCGAGTTG
451 TCGGGTTTGG GCGCGTATGC GAAGGGCGTG TTGCTGTGTG CGGCAGGCAG
501 TATGGCCTGG GTGTGTTATG CCGTGGCGCA AAAGCTGCTG TCGGCGCAAT 
551 TCGGGCCGCA ACAGATTCTG CTGTTGATTT ATGCGGcaag tgccgccGTG 
601 TTCCtgccgT TTGccgaaCC GGCACACATC GGAAGTTTgg aCGGTACGtt 
651 GGCGTGGGTT TGTTTTGTGT ATTGCTGCTT GAATACGTTA ATCGGTTACG 
701 GCTCGTTCGG CGAGGCGTTG AAACATTGGG AGGCTTCCAA AGTCAGCGCG 
751 GTAACAACCT TGCTCCCCGT GTTTACCGTA ATATTTTCTT TGCTCGGGCA 
801 TTATGTGATG CCTGATACTT TTGCCGCGCC GGATATGAAC GGTTTGGGTT
851 ATGTCGGCGC ACTGGTCGTG GTCGGGGGTG CGGTTACGGC GGCGGTGGGG 
901 GACAGGCCGT TCAAACGCCG CTAG 

This corresponds to the amino acid sequence [<SEQ ID 408; ORF104ng-1>] (SEQ ID NO: 408;
ORF104ng-1):

1 MENQRPLLGF ALALLAAMTW GTLPIAVRQV LKFVDAPTLV WVRFTVAAAV

51 LFVLLALGGR LPKRRDFSWH SFRLLLLGVT GISANFVLIA QGLHYISPTT

101 TQVLWQISPF TMIVVGVLVF KDRMTAAQKI GLVLLLVGLL MFFNDKFGEL

151 SGLGAYAKGV LLCAAGSMAW VCYAVAQKLL SAQFGPQQIL LLIYAASAAV

201 FLPFAEPAHI GSLDGTLAWV CFVYCCLNTL IGYGSFGEAL KHWEASKVSA

251 VTTLLPVFTV IFSLLGHYVM PDTFAAPDMN GLGYVGALVV VGGAVTAAVG
301 DRPFKRR*

ORF104ng-1 (SEQ ID NO: 408) and ORF104-1 (SEQ ID NO: 402) show 97.5% identity in 277 aa
overlap:

                       10        20        30        40        50        60
orf104-1.pep   MENQRPLLGFALALLAAMTWGTLPIAVRQVLKFVDAPTLVWVRFTVAAAVLFVLLALGGR
               ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
orf104ng-1     MENQRPLLGFALALLAAMTWGTLPIAVRQVLKFVDAPTLVWVRFTVAAAVLFVLLALGGR
                       10        20        30        40        50        60

                       70        80        90       100       110       120
orf104-1.pep   LPKRRDFSWCSFRLLLLGVAGISANFVLIAQGLHYISPTTTQVLWQISPFTMIVVGVLVF
               ||||||||| ||||||||| ||||||||||||||||||||||||||||||||||||||||
orf104ng-1     LPKRRDFSWHSFRLLLLGVTGISANFVLIAQGLHYISPTTTQVLWQISPFTMIVVGVLVF
                       70        80        90       100       110       120

                      130       140       150       160       170       180
orf104-1.pep   KDRMTAAQKIGLVLLLAGLLMFFNDKFGELSGLGAYAKGVLLCAAGSMAWVCYAVAQKLL
               |||||||||||||||| |||||||||||||||||||||||||||||||||||||||||||
orf104ng-1     KDRMTAAQKIGLVLLLVGLLMFFNDKFGELSGLGAYAKGVLLCAAGSMAWVCYAVAQKLL
                      130       140       150       160       170       180

                      190       200       210       220       230       240
orf104-1.pep   SAQFGPQQILLLIYAASAAVFLPFAEPAHIGSLDGTLAWVCFAYCCLNTLIGYGSFGEAL
               |||||||||||||||||||||||||||||||||||||||||| |||||||||||||||||
orf104ng-1     SAQFGPQQILLLIYAASAAVFLPFAEPAHIGSLDGTLAWVCFVYCCLNTLIGYGSFGEAL
                      190       200       210       220       230       240

                      250       260       270
orf104-1.pep   KHWEASKVSAVTTLLPVFTVIXXLLGHYVMPETFAAP
               |||||||||||||||||||||  |||||||| |||||
orf104ng-1     KHWEASKVSAVTTLLPVFTVIFSLLGHYVMPDTFAAPDMNGLGYVGALVVVGGAVTAAVG
                      250       260       270       280       290       300

In addition, ORF104ng-1 (SEQ ID NO: 408) shows significant homology with a hypothetical
H. influenzae protein (SEQ ID NO: 1138):

gi|1573895 (U32769) hypothetical [Haemophilus influenzae] Length = 306
Score = 237 bits (598), Expect = 8e-62
Identities = 114/280 (40%), Positives = 168/280 (59%), Gaps = 8/280 (2%)

Query: 30  QRPXXXXXXXXXXXMTWGTLPIAVRQVLKFVDAPTLVWXXXXXXXXXXXXXXXXXXXXP- 88
           Q+P           M WG+LPIA++QVL  ++A T+VW                    P
Sbjct: 3   QQPLLGFTFALITAMAWGSLPIALKQVLSVMNAQTIVWYRFIIAAVSLLALLAYKKQLPE 62

Query: 89  --KRRDFSWHSFRLLLLGVTGISANFVLIAQGLHYISPTTTQVLWQISPFTMIVVGVLVF 146
             K R ++W    ++L+GV G+++NF+L +  L+YI P+  Q+   +S F M++ GVL+F
Sbjct: 63  LMKVRQYAW----IMLIGVIGLTSNFLLFSSSLNYIEPSVAQIFIHLSSFGMLICGVLIF 118

Query: 147 KDRMTAAQKIXXXXXXXXXXMFFNDKFGELSGLGAYAKGVLLCAAGSMAWVCYAVAQKLL 206
           K+++   QKI          +FFND+F   +GL  Y+ GV+L   G++ WV Y +AQKL+
Sbjct: 119 KEKLGLHQKIGLFLLLIGLGLFFNDRFDAFAGLNQYSTGVILGVGGALIWVAYGMAQKLM 178

Query: 207 SAQFGPQQILLLIYAASAAVFLPFAEPAHIGSLDGTLAWVCFVYCCLNTLIGYGSFGEAL 266
             +F  QQILL++Y   A  F+P A+ + +  L   LA +CF+YCCLNTLIGYGS+ EAL
Sbjct: 179 LRKFNSQQILLMMYLGCAIAFMPMADFSQVQELT-PLALICFIYCCLNTLIGYGSYAEAL 237

Query: 267 KHWEASKVSAVTTLLPVFTVIFSLLGHYVMPDTFAAPDMN 306
             W+ SKVS V TL+P+FT++FS + HY  P  FAAP++N
Sbjct: 238 NRWDVSKVSVVITLVPLFTILFSHIAHYFSPADFAAPELN 277

Based on this analysis, including the presence of a putative leader sequence and several putative 
transmembrane domains in the gonococcal protein, it is predicted that the proteins from 
N. meningitidis and N. gonorrhoeae, and their epitopes, could be useful antigens for vaccines or 
diagnostics, or for raising antibodies. 
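
By way of illustration only, the percentage identity figures quoted for the pairwise comparisons in this Example can be recomputed from an aligned pair of sequences along the lines of the following Python sketch. The function name percent_identity, the use of '-' for gaps, the rule that the ambiguity code X never counts as a match, and the use of the full alignment length as the denominator are assumptions made for this sketch; they are not stated in the specification and need not correspond to the settings of the program that produced the reported values.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent of alignment columns in which both rows carry the same residue.

    Assumptions (hypothetical, not taken from the specification):
    gaps are written '-', the ambiguity code 'X' never counts as a match,
    and the denominator is the total number of alignment columns.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    if not aligned_a:
        raise ValueError("alignment is empty")
    matches = sum(
        1
        for a, b in zip(aligned_a.upper(), aligned_b.upper())
        if a == b and a not in "-X"
    )
    return 100.0 * matches / len(aligned_a)


# Residues 241-277 of ORF104-1 and ORF104ng-1 as listed above.
orf104_1 = "KHWEASKVSAVTTLLPVFTVIXXLLGHYVMPETFAAP"
orf104ng_1 = "KHWEASKVSAVTTLLPVFTVIFSLLGHYVMPDTFAAP"
print(round(percent_identity(orf104_1, orf104ng_1), 1))
```

For this 37-residue segment the sketch counts 34 identical positions; the two X positions and the E/D substitution are the only differences.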



Example 48 



The following partial DNA sequence was identified in N. meningitidis [<SEQ ID 409>] (SEQ ID
NO: 409):



1 ATGGTAGCTC GTCGGGCTCA TAACCCGAAG GTCGTAGGTT CGAATCCTGT 

51 .CCCGCAACC TAATTTCAAA CCCCTCGGTT CAATGCCGAG GG . GTTTTGT 

101 T.TTGCCTGT TTCCTGTTTC CTGTTTCCTG CCGCCTCCGT TTTTTGCCGG 

151 ATTTTCCTTC CGGCCGCAAT ATCGGAACGG CAGACCGCCG TCTGTTTGCG 

201 GTTGCAAATT CAGGCAGTTT GGCTACAATC TTCCGCATTG TCTTCAAGAA 

251 AGCCAACCAT GCCGACCGTC CGTTTTACCG AATCCGTCAG CAAACAAGAC 

301 CTTGATGCTC TGTTCGAGTG GGCAAAAGCA AGTTACGGTG CAGAAAGTTG
351 CTGGAAAACG CTGTATCTGA ACGGTCysCC TTTGGGCAAC CTGTCGCCGG
401 AATGGGTGGA ACGCGTsmmA AAAGACTGGG AGGCAGGCTG CyCGGAGTCT

451 TCAGACGGCA TTTTTCTGAA TgCGGACGGc TGgCctGATA TGGgCGGAcg
501 cTTACAGCAC CTCGCCCTCG GTTGGCACTG TGCGGGGCTG TTGGACGgsT 
551 GGCGCAACGA GTGTTTCGAC CTGACCGACG GCGGCGGCAA CCCCTTGTTC 
601 ACGCTCGaAc GCGCCGyTTT mCGTCCTkTC GGACTGCTCA GCCGCGCCGT 
651 CCATCTCAAC GGTCTGACCG AATCGGACGG CCGATGGCAT TTCTGGATAG 
701 GCAGGCGCAG TCCGCACAAA GCAGTCGATC CCAACAAACT CGACAATACT 
751 rCCGCCGGCG GTGTTTCCGG CGGCGAAATG CCGTCTGAAG CCGTGTGTCG 
801 CGAAAGCAGC GAAGAAGCCG GTTTGGATAA AACGCTGcTT CCGCTCATCC 
851 GCCCGGTATC GCAGCTGCAC AGCCTGCGCT CCGTCAGCCG GGGTGTACAC 
901 AATGAAATCC TGTATGTATT CGATGCCGTC CTGCCG. . . 



This corresponds to the amino acid sequence [<SEQ ID 410; ORF105>] (SEQ ID NO: 410;
ORF105):



1 MVARRAHNPK VVGSNPXPAT XFQTPRFNAE XVLXLPVSCF LFPAASVFCR

51 IFLPAAISER QTAVCLRLQI QAVWLQSSAL SSRKPTMPTV RFTESVSKQD

101 LDALFEWAKA SYGAESCWKT LYLNGXPLGN LSPEWVERVX KDWEAGCXES

151 SDGIFLNADG WPDMGGRLQH LALGWHCAGL LDGWRNECFD LTDGGGNPLF

201 TLERAXXRPX GLLSRAVHLN GLTESDGRWH FWIGRRSPHK AVDPNKLDNT

251 XAGGVSGGEM PSEAVCRESS EEAGLDKTLL PLIRPVSQLH SLRSVSRGVH

301 NEILYVFDAV LP...



Further work revealed the complete nucleotide sequence [<SEQ ID 411>] (SEQ ID NO: 411):

1 ATGCCGACCG TCCGTTTTAC CGAATCCGTC AGCAAACAAG ACCTTGATGC

51 TCTGTTCGAG TGGGCAAAAG CAAGTTACGG TGCAGAAAGT TGCTGGAAAA 

101 CGCTGTATCT GAACGGTCTG CCTTTGGGCA ACCTGTCGCC GGAATGGGTG 

151 GAACGCGTCA AAAAAGACTG GGAGGCAGGC TGCTCGGAGT CTTCAGACGG 

201 CATTTTTCTG AATGCGGACG GCTGGCCTGA TATGGGCGGA CGCTTACAGC 

251 ACCTCGCCCT CGGTTGGCAC TGTGCGGGGC TGTTGGACGG CTGGCGCAAC 

301 GAGTGTTTCG ACCTGACCGA CGGCGGCGGC AACCCCTTGT TCACGCTCGA 

351 ACGCGCCGCT TTCCGTCCTT TCGGACTGCT CAGCCGCGCC GTCCATCTCA 

401 ACGGTCTGAC CGAATCGGAC GGCCGATGGC ATTTCTGGAT AGGCAGGCGC

451 AGTCCGCACA AAGCAGTCGA TCCCAACAAA CTCGACAATA CTGCCGCCGG

501 CGGTGTTTCC GGCGGCGAAA TGCCGTCTGA AGCCGTGTGT CGCGAAAGCA

551 GCGAAGAAGC CGGTTTGGAT AAAACGCTGC TTCCGCTCAT CCGCCCGGTA

601 TCGCAGCTGC ACAGCCTGCG CTCCGTCAGC CGGGGTGTAC ACAATGAAAT 

651 CCTGTATGTA TTCGATGCCG TCCTGCCCGA AACCTTCCTG CCTGAAAATC 

701 AGGATGGCGA AGTGGCGGGT TTTGAGAAAA TGGACATCGG CGGTCTGTTG 

751 GATGCCATGT TGTCGGGAAA CATGATGCAC GACGCGCAAC TGGTTACGCT 

801 GGACGCGTTT TGCCGTTACG GTCTGATTGA TGCCGCCCAT CCGCTGTCCG 

851 AGTGGCTGGA CGGCATACGT TTATAG 

This corresponds to the amino acid sequence [<SEQ ID 412; ORF105-1>] (SEP ID NO: 412; 
ORF105-1) : 



1 MPTVRFTESV SKQDLDALFE WAKASYGAES CWKTLYLNGL PLGNLSPEWV 

51 ERVKKDWEAG CSESSDGIFL NADGWPDMGG RLQHLALGWH CAGLLDGWRN 

101 ECFDLTDGGG NPLFTLERAA FRPFGLLSRA VHLNGLTESD GRWHFWIGRR 

151 SPHKAVDPNK LDNTAAGGVS GGEMPSEAVC RESSEEAGLD KTLLPLIRPV 

201 SQLHSLRSVS RGVHNEILYV FDAVLPETFL PENQDGEVAG FEKMDIGGLL 

251 DAMLSGNMMH DAQLVTLDAF CRYGLIDAAH PLSEWLDGIR L* 
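
As an informal illustration of how a nucleotide listing such as SEQ ID NO: 411 corresponds to the amino acid sequence given for it, the following Python sketch translates a DNA string codon by codon using the standard genetic code. The function name translate, the reading frame starting at the first nucleotide, the rendering of stop codons as '*', and the rule that any codon containing a symbol other than A, C, G or T is reported as X are assumptions made for this sketch and are not taken from the specification.

```python
# Standard genetic code, built from the conventional TCAG ordering.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    b1 + b2 + b3: AMINO[16 * i + 4 * j + k]
    for i, b1 in enumerate(BASES)
    for j, b2 in enumerate(BASES)
    for k, b3 in enumerate(BASES)
}


def translate(dna: str) -> str:
    """Translate DNA in frame 1; whitespace is ignored, case is folded.

    Assumption (hypothetical): codons containing ambiguity symbols or
    placeholders are reported as 'X'; a trailing partial codon is dropped.
    """
    seq = "".join(dna.split()).upper()
    protein = []
    for pos in range(0, len(seq) - 2, 3):
        protein.append(CODON_TABLE.get(seq[pos:pos + 3], "X"))
    return "".join(protein)


# First 30 nucleotides of the complete sequence listed above.
print(translate("ATGCCGACCG TCCGTTTTAC CGAATCCGTC"))
```

Applied to the first thirty nucleotides of the listing, the sketch returns MPTVRFTESV, which agrees with the first ten residues shown for ORF105-1.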

Computer analysis of this amino acid sequence gave the following results: 
Homology with a predicted ORF from N. meningitidis (strain A) 

ORF105 (SEQ ID NO: 410) shows 89.4% identity over a 226aa overlap with an ORF (ORF105a)
(SEQ ID NO: 414) from strain A of N. meningitidis:

60 70 80 90 100 110 

orf 105 . pep ISERQTAVCLRLQIQAVWLQSSALSSRKPTMPTVRFTESVSKQDLDALFEWAKASYGAES 

lllllllllllhlllllllllllllllll 

orf 105a MPTVRFTESVSKHDLDALFEWAKASYGAES 

10 20 30 

120 130 140 150 160 170 

orf 105 pep CWKTLYLNGXPLGNLS PEWVERVXKDWEAGCXES SDG I FLNADGWPDMGGRLQHLALGWH 

IIIIMIII llllllllhlll lllllll IIIIIMMIIIIIIII Mill 

or f 1 0 5 a CWKTL YLNGLPLGNLS PEWAERVKKDWEAGCS ES SDG I FLNADGWPDMGRRLQHLAR I WK 

40 50 60 70 80 90 

180 190 200 210 220 230 

orf 105 . pep CAGLLDGWRNECFDLTDGGGNPLFTLERAXXRPXGLLSRAVHLNGLTESDGRWHFWIGRR 

I I I I II M I I I I I I I I M I II M I I I I II MIMIMI I Ml IIIIMIII 
orf 105a EAGLLHGWRDECFDLTDGGSNPLFALERAAFRPFGLLSRAVHLNGLVESDGRWHFWIGRR 

100 110 120 130 140 150

240 250 260 270 280 290

orf 105 . pep SPHKAVDPNKLDNTXAGGVSGGEMPSEAVCRESSEEAGLDKTLLPLIRPVSQLHSLRSVS 

MINIMUM M MM M M M M M M M M M M M M M M MM M II 

orf 105a SPHKAVDPDKLDNTAAGGVSSGELPSETVCRESSEEAGLDKTLLPLIRPVSQLHSLRPVS 
5 160 170 180 190 200 210 

300 310 
orf 105. pep RGVHNE I L YVFDAVL P 

. MMMMMMMM 

orf 105a RGVHNEILYVFDAVLPETFLPENQDGEVAGFEKMDIGGLLAAMLSGNMMHDAQLVTLDAF 
10 220 230 240 250 260 270 

The complete length ORF105a nucleotide sequence [<SEQ ID 413>] (SEQ ID NO: 413) is:

1 ATGCCGACCG TCCGTTTTAC CGAATCCGTC AGCAAACACG ACCTTGATGC 

51 CCTATTCGAG TGGGCAAAGG CAAGTTACGG TGCGGAAAGT TGCTGGAAAA 

101 CGCTGTATCT GAACGGTCTG CCTTTGGGCA ATCTGTCGCC GGAATGGGCG

151 GAGCGCGTCA AAAAAGACTG GGAGGCAGGC TGCTCGGAGT CTTCAGACGG

201 CATTTTCCTG AATGCGGACG GCTGGCCAGA TATGGGCAGA CGCTTGCAGC
251 ACCTCGCCCG AATATGGAAA GAAGCGGGAC TGCTTCACGG CTGGCGCGAC

301 GAGTGTTTCG ACCTGACCGA CGGCGGCAGC AATCCCTTGT TCGCGCTCGA
351 ACGCGCCGCT TTCCGTCCGT TCGGACTGCT CAGCCGCGCC GTCCATCTCA

401 ACGGTTTGGT CGAATCGGAC GGCCGATGGC ATTTCTGGAT AGGCAGGCGC
451 AGTCCGCACA AAGCAGTCGA TCCCGACAAA CTCGACAATA CTGCCGCCGG
501 CGGTGTTTCC AGCGGTGAAT TGCCGTCTGA AACCGTGTGT CGCGAAAGCA
551 GCGAAGAAGC CGGTTTGGAT AAAACGCTGC TTCCGCTCAT CCGCCCGGTA

601 TCGCAGCTGC ACAGCCTGCG CCCCGTCAGC CGGGGTGTGC ACAATGAAAT

651 CCTGTATGTA TTCGATGCCG TCCTGCCCGA AACCTTCCTG CCTGAAAATC

701 AGGATGGCGA AGTGGCGGGT TTTGAGAAAA TGGACATCGG CGGTCTGTTG

751 GCTGCCATGT TGTCGGGAAA CATGATGCAC GACGCGCAAC TGGTTACGCT

801 GGACGCGTTT TGCCGTTACG GTCTGATTGA TGCCGCCCAT CCGCTGTCCG

851 AGTGGCTGGA CGGCATACGT TTATAG

This encodes a protein having amino acid sequence [<SEQ ID 414>] (SEQ ID NO: 414):

1 MPTVRFTESV SKHDLDALFE WAKASYGAES CWKTLYLNGL PLGNLSPEWA

51 ERVKKDWEAG CSESSDGIFL NADGWPDMGR RLQHLARIWK EAGLLHGWRD

101 ECFDLTDGGS NPLFALERAA FRPFGLLSRA VHLNGLVESD GRWHFWIGRR

151 SPHKAVDPDK LDNTAAGGVS SGELPSETVC RESSEEAGLD KTLLPLIRPV

201 SQLHSLRPVS RGVHNEILYV FDAVLPETFL PENQDGEVAG FEKMDIGGLL

251 AAMLSGNMMH DAQLVTLDAF CRYGLIDAAH PLSEWLDGIR L*

ORF105a (SEQ ID NO: 414) and ORF105-1 (SEQ ID NO: 412) show 93.8% identity in 291 aa
overlap:

10 20 30 40 50 60 

orf 105a . pep MPTVRFTESVSKHDLDALFEWAKASYGAESCWKTLYLNGLPLGNLSPEWAERVKKDWEAG 

MMMMM MMMMMMMM MMI MMMM MMMMMMMM M 

45 orf 105 - 1 MPTVRFTESVSKQDLDALFEWAKASYGAESCWKTLYLNGLPLGNLSPEWVERVKKDWEAG 

10 20 30 40 50 60 

70 80 90 100 110 120 

orf 105a . pep CSESSDGIFLNADGWPDMGRRLQHLARIWKEAGLLHGWRDECFDLTDGGSNPLFALERAA 

MMMMMMMM I II I II I = 1 1 1 1 1 1 1 = 1 1 1 1 1 1 1 1 1 = 1 1 1 1 = 1 1 1 1 1 

50 or f 1 0 5 - 1 CSESSDGI FLNADGWPDMGGRLQHLALGWHCAGLLDGWRNECFDLTDGGGNPLFTLERAA 

70 80 90 100 110 120

130 140 150 160 170 180

orf 105a . pep FRPFGLLSRAVHLNGLVESDGRWHFWIGRRS PHKAVDPDKLDNTAAGGVSSGELPSETVC 
II II I I I I II II I I Ml II M I I II I I I M I I I I I Ml I I I II I I II M II M I Ml I 
orf 105-1 FRPFGLLSRAVHLNGLTESDGRWHFWIGRRS PHKAVDPNKLDNTAAGGVSGGEMPSEAVC 

5 130 140 150 160 170 180 

190 200 210 220 230 240 

orf 105a . pep RESSEEAGLDKTLLPLIRPVSQLHSLRPVSRGVHNEILYVFDAVLPETFLPENQDGEVAG 

I 1 1 1 1 1 1 1 1 1 1 1 1 M 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 M 1 1 1 II 1 1 1 II 1 1 1 1 1 1 1 

orf 105-1 RESSEEAGLDKTLLPLIRPVSQLHSLRSVSRGVHNEILYVFDAVLPETFLPENQDGEVAG 
10 190 200 210 220 230 240 

250 260 270 280 290 

orf 105a . pep FEKMD I GGLLAAMLSGNMMHDAQLVTLDAFCRYGL I DAAHPLS EWLDGI RLX 
llllllllll I I i I I I I I I I I I I I I I I I I I I I I I I II I I I I I I I I I I I I 
orf 105-1 FE KMD I GGLLDAMLSGNMMHDAQLVTLDAFCR YGL I DAAH PLS EWLDGI RLX 

15 250 260 270 280 290 

Homology with a predicted ORF from N. gonorrhoeae 

ORF105 (SEQ ID NO: 410) shows 87.5% identity over a 312aa overlap with a predicted ORF
(ORF105.ng) (SEQ ID NO: 416) from N. gonorrhoeae:



orf105.pep  MVARRAHNPKVVGSNPXPATXFQTPRFNAEXVLXLPVSCFLFPAASVFCRIFLPAAISER 60
            |||||||||||||||| |||  |||||||| ||      |||||||||||||||||||||
orf105ng    MVARRAHNPKVVGSNPAPATKYQTPRFNAEGVLF-----FLFPAASVFCRIFLPAAISER 55

orf105.pep  QTAVCLRLQIQAVWLQSSALSSRKPTMPTVRFTESVSKQDLDALFEWAKASYGAESCWKT 120
            | |||||||||||||||||| |||| |||||||||||||||||||| |||||||||||||
orf105ng    QAAVCLRLQIQAVWLQSSALCSRKPAMPTVRFTESVSKQDLDALFERAKASYGAESCWKT 115

orf105.pep  LYLNGXPLGNLSPEWVERVXKDWEAGCXESSDGIFLNADGWPDMGGRLQHLALGWHCAGL 180
            ||||  ||||||||| ||  ||||||| ||| ||||||||||||||||||||  |  |||
orf105ng    LYLNRLPLGNLSPEWAERIKKDWEAGCSESSNGIFLNADGWPDMGGRLQHLARTWNKAGL 175

orf105.pep  LDGWRNECFDLTDGGGNPLFTLERAXXRPXGLLSRAVHLNGLTESDGRWHFWIGRRSPHK 240
            | |||||||||||||||||||||||  || ||| |||||||| || ||||||||||||||
orf105ng    LHGWRNECFDLTDGGGNPLFTLERAAFRPFGLLIRAVHLNGLVESNGRWHFWIGRRSPHK 235

orf105.pep  AVDPNKLDNTXAGGVSGGEMPSEAVCRESSEEAGLDKTLLPLIRPVSQLHSLRSVSRGVH 300
            |||| ||||   ||||||||||||||||||||||||||| ||||||| ||||| ||||||
orf105ng    AVDPGKLDNIAGGGVSGGEMPSEAVCRESSEEAGLDKTLFPLIRPVSRLHSLRPVSRGVH 295

orf105.pep  NEILYVFDAVLP 312
            ||||||||||||
orf105ng    NEILYVFDAVLPETFLPENQDGEVAGFEKMDIGGLLDAMLSKNMMHDAQLVTLDAFYRYG 355

A complete length ORF105ng nucleotide sequence [<SEQ ID 415>] (SEQ ID NO: 415) was
predicted to encode a protein having amino acid sequence [<SEQ ID 416>] (SEQ ID NO: 416):



1 MVARRAHNPK VVGSNPAPAT KYQTPRFNAE GVLFFLFPAA SVFCRIFLPA

51 AISERQAAVC LRLQIQAVWL QSSALCSRKP AMPTVRFTES VSKQDLDALF

101 ERAKASYGAE SCWKTLYLNR LPLGNLSPEW AERIKKDWEA GCSESSNGIF

151 LNADGWPDMG GRLQHLARTW NKAGLLHGWR NECFDLTDGG GNPLFTLERA

201 AFRPFGLLIR AVHLNGLVES NGRWHFWIGR RSPHKAVDPG KLDNIAGGGV
251 SGGEMPSEAV CRESSEEAGL DKTLFPLIRP VSRLHSLRPV SRGVHNEILY

301 VFDAVLPETF LPENQDGEVA GFEKMDIGGL LDAMLSKNMM HDAQLVTLDA
351 FYRYGLIDAA HPLSEWLDGI RL*



Further work revealed the complete nucleotide sequence [<SEQ ID 417>] (SEP ID NO: 417) : 



1 ATGCCGACCG TCCGTTTTAC CGAATCCGTC AGCAAACAAG ACCTTGATGC 

51 CCTGTTCGAG CGGGCAAAAG CAAGTTACGG TGCCGAAAGT TGCTGGAAAA 

101 CGCTGTATCT GAACCGTCTT CCTTTGGGCA ATCTGTCGCC GGAATGGGCT 

151 GAGCGCATCA AAAAAGACTG GGAGGCAGGC TGCTCCGAGT CTTCAGACGG 

201 CATTTTTCTG AATGCGGACG GCTGGCCGGA TATGGGCGGA CGCTTGCAGC 

251 ACCTCGCCCG CACATGGAAC AAGGCGGGGC TGCTTCACGG ATGGCGCAAC 

301 GAGTGTTTCG ACCTGACCGA CGGCGGCGGC AACCCCTTGT TCACGCTCGA 

351 ACGCGCCGCT TTCCGTCCGT TCGGACTACT CAGCCGCGCC GTCCATCTCA

401 ACGGTTTGGT CGAATCGAAC GGCAGATGGC ATTTTTGGAT AGGCAGGCGC
451 AGTCCGCACA AAGCAGTCGa tcCCGGCAAG CTCGACAATA TTGCCGGCGG 
501 CGGTGTTTCC GGCGGCGAAA TGCCGTCTGA AGCCGTGTGC CGCGAAAGCA 
551 GCGAAGAAGC CGGTTTGGAT AAAACGCTGT TTCCGCTCAT CCGCCCAGTA 
601 TCGCGGCTGC ACAGCCTTCG CCCCGTCAGC CGAGGTGTGC ACAATGAAAT 
651 CCTGTATGTG TTCGATGCCG TCCTGCCCGA AACCTTCCTG CCTGAAAATC 
701 AGGATGGCGA GGTAGCGGGT TTTGAAAAGA TGGACATTGG CGGCCTATTG 
751 GATGCCATGT TGTCGAAAAA CATGATGCAC GACGCGCAAC TGGTTACGCT 
801 GGACGCGTTT TACCGTTACG GTCTGATTGA TGCCGCCCAT CCGCTGTCCG 
851 AGTGGCTGGA CGGCATACGT TTATAG 

This corresponds to the amino acid sequence [<SEQ ID 418; ORF105ng-1>] (SEQ ID NO: 418;
ORF105ng-1):



1 MPTVRFTESV SKQDLDALFE RAKASYGAES CWKTLYLNRL PLGNLSPEWA 

51 ERIKKDWEAG CSESSDGIFL NADGWPDMGG RLQHLARTWN KAGLLHGWRN

101 ECFDLTDGGG NPLFTLERAA FRPFGLLSRA VHLNGLVESN GRWHFWIGRR 

151 SPHKAVDPGK LDNIAGGGVS GGEMPSEAVC RESSEEAGLD KTLFPLIRPV 

201 SRLHSLRPVS RGVHNEILYV FDAVLPETFL PENQDGEVAG FEKMDIGGLL 

251 DAMLSKNMMH DAQLVTLDAF YRYGLIDAAH PLSEWLDGIR L*

ORF105ng-1 (SEQ ID NO: 418) and ORF105-1 (SEQ ID NO: 412) show 93.5% identity in 291 aa
overlap:



10 . 20 30 40 50 60 

or f 105-1. pep MPTVRFTESVSKQDLDALFEWAKASYGAESCWKTLYLNGLPLGNLS PEWVERVKKDWEAG 

Illlllllllllllllllll IIIIMIIIIIIIIIII lllllllllhlhlllllll 
orf 105ng-l MPTWFTESVSKQDLDALFERAKASYGAESCWKTLYLNRLPLGNLS PEWAER I KKDWEAG 

10 20 30 40 50 60 



70 80 90 100 110 120 

orf 105-1 .pep CSESSDGIFLNADGWPDMGGRLQHLALGWHCAGLLDGWRNECFDLTDGGGNPLFTLERAA 

III Mill MINIM Mill III II h 1 1 1 I 1 1 1 1 1 1 1 1 1 1 II I 1 1 1 1 1 1 1 1 1 1 I 

or f 10 5ng- 1 CSESSDGI FLNADGWPDMGGRLQHLARTWNKAGLLHGWRNECFDLTDGGGNPLFTLERAA 

70 80 90 100 110 120 



130 140 150 160 170 180 

orf 105- 1 .pep FRPFGLLSRAVHLNGLTESDGRWHFW I GRRSPHKAVDPNKLDNTAAGGVS GGEMPSEAVC 

I I I I I I I I I I I U h I - I I I I I I I I I I I I I I II M I I I hllllllllllllll 
orf 105ng- 1 FRPFGLLSRAVHLNGLVESNGRWHFW I GRRSPHKAVDPGKLDN I AGGGVS GGEMPSEAVC

130 140 150 160 170 180

190 200 210 220 230 240 

orf 105-1. pep RESSEEAGLDKTLLPLIRPVSQLHSLRSVSRGVHNEILYVFDAVLPETFLPENQDGEVAG 

I II I II III Nihil II 1 1 hi III I MM MM MM I II II MINN MM I II I 

5 orf 105ng-l RESSEEAGLDKTLFPLIRPVSRLHSLRPVSRGVHNEILYVFDAVLPETFLPENQDGEVAG 

190 200 210 220 230 240 

250 260 270 . 280 290 

orf 105-1 .pep FEKMD I GGLLDAMLSGNMMHDAQLVTLDAFCRYGL I DAAHPLS EWLDG I RLX 

II I Ml II I II 1 1 II II I II II I II MM III MM III MUM Mill 

10 orf 105ng- 1 FEKMD I GGLLDAMLSKNMMHDAQLVTLDAFYRYGL I DAAHPLS EWLDG I RLX 

250 260 270 280 290 

Furthermore, ORF105ng-1 (SEQ ID NO: 418) shows homology with a yeast enzyme (SEQ ID NO:
1139):



sp|P41888|TNR3_SCHPO THIAMIN PYROPHOSPHOKINASE (TPK) (THIAMIN KINASE)
>gi|1076928|pir||S52350 thiamin pyrophosphokinase (EC 2.7.6.2) - fission yeast
(Schizosaccharomyces pombe) >gi|666111 (X84417) thiamin pyrophosphokinase
[Schizosaccharomyces pombe] >gi|2330852|gnl|PID|e334056 (Z98533) thiamin
pyrophosphokinase [Schizosaccharomyces pombe] Length = 569
Score = 105 bits (259), Expect = 4e-22
Identities = 64/192 (33%), Positives = 94/192 (48%), Gaps = 3/192 (1%)

Query: 268 NKAGLLHGWRNECFDLTDGGGNPLFTLERAAFRPFGLLSRAVHLNGLVESNGRW--HFWI 441
           N  G+   WRNE + +      P+  +ER  F  FG LS  VH    + +        W+
Sbjct: 96  NTFGIADQWRNELYTVYGKSKKPVLAVERGGFWLFGFLSTGVHCTMYIPATKEHPLRIWV 155

Query: 442 GRRSPHKAVDPGKLDNIAGGGVSGGEMPSEAVCRESSEEAGLDKTLFPLIRPVSRLHSLR 621
            RRSP K   P  LDN   GG++ G+     + +E SEEA LD +   LI P   +  ++
Sbjct: 156 PRRSPTKQTWPNYLDNSVAGGIAHGDSVIGTMIKEFSEEANLDVSSMNLI-PCGTVSYIK 214

Query: 622 PVSRG-VHNEILYVFDAVLPETFLPENQDGEVAGFEKMDIGGLLDAMLSKNMMHDAQLVT 798
              R  +  E+ YVFD  + +  +P   DGEVAGF  + +  +L  +  K+   +  LV
Sbjct: 215 MEKRHWIQPELQYVFDLPVDDLVIPRINDGEVAGFSLLPLNQVLHELELKSFKPNCALVL 274

Query: 799 LDAFYRYGLIDAAHP 843
           LD   R+G+I   HP
Sbjct: 275 LDFLIRHGIITPQHP 289





Based on this analysis, including the presence of a putative transmembrane domain in the
gonococcal protein, it is predicted that the proteins from N. meningitidis and N. gonorrhoeae, and
their epitopes, could be useful antigens for vaccines or diagnostics, or for raising antibodies.

Example 49 

The following DNA sequence, believed to be complete, was identified in N. meningitidis [<SEQ ID
419>] (SEQ ID NO: 419):



1 ATGAATAGAC CCAAGCAACC CTTCTTCCGT CCCGAAGTCG CCGTTGCCCG

51 CCAAACCAGC CTGACGGGTA AAGTGATTCT GACACGACCG TTGTCATTTT

101 CCCTATGGAC GACATTTGCA TCGATATCTG CGTTATTGAT TATCCTGTTT 

151 TTGATATTTG GTAACTATAC GCGAAAGACA ACAGTGGAGG GACAAATTTT 

201 ACCTGCATCG GGCGTAATCA GGGTGTATGC ACCGgATACG rGkACAATTA
251 CAGCGAAATT CGTGGAAGAT GGmsAAAAGG TTAAGGCTGG CGACAAGCTA

301 TTTGCGCTTT CGACCTCACG TTTCGGCGCA GGAGGTAGCG TGCAGCAGCA
351 GTTGAAAACG GAGGCAGTTT TGAAGAAAAC GTTGGCAGAA CAGGAACTGG

401 GTCGTCTGAA GCTGATACAC GGGAATGAAA CGCGCAgCcT TAAAGCAACT
451 GTCGAACGTT TGGAAAACCA GGAACTCCAT ATTTCGCAAC AGATAGACGG

501 TCAGAAAAGG . CGCATTAGAC TTGCGGAAGA AATGTTGCAG AAATATCGTT

551 TCCTATCCGC . CAATGA

This corresponds to the amino acid sequence [<SEQ ID 420; ORF107>] (SEQ ID NO: 420;
ORF107):

1 MNRPKQPFFR PEVAVARQTS LTGKVILTRP LSFSLWTTFA SISALLIILF

51 LIFGNYTRKT TVEGQILPAS GVIRVYAPDT XTITAKFVED GXKVKAGDKL 
101 FALSTSRFGA GGSVQQQLKT EAVLKKTLAE QELGRLKLIH GNETRSLKAT 
151 VERLENQELH ISQQIDGQKR RIRLAEEMLQ KYRFLSXQ* 

Computer analysis of this amino acid sequence gave the following results:
Homology with a predicted ORF from N. meningitidis (strain A)

ORF107 (SEQ ID NO: 420) shows 97.8% identity over a 186aa overlap with an ORF (ORF107a)
(SEQ ID NO: 422) from strain A of N. meningitidis:

10 20 30 40 50 60 

25 orf 107. pep MNRPKQPFFRPEVAVARQTSLTGKVILTRPLS FSLWTTFAS I S ALLI ILFL I FGNYTRKT 

1 1 1 1 1 1 1 1 1 ! 1 1 1 i M I II 1 1 i 1 1 1 1 1 1 1 1 M 1 1 1 1 1 1 1 1 1 1 M I II 1 1 1 1 1 M 1 1 1 1 

orf 107a MNRPKQPFFRPEVAVARQTSLTGKVILTRPLS FSLWTTFAS IS ALL I ILFL I FGNYTRKT 

10 20 30 40 50 60 

70 80 90 100 110 - 120 

30 orf 107 . pep TVEGQILPASGVIRVYAPDTXTITAKFVEDGXKVKAGDKLFALSTSRFGAGGSVQQQLKT 

IIIIIIIIIIIIIIIIIIII Mllll Ml IIIIIIIIIIIIIIIIIII IIIIIIII 
orf 107a TVEGQ I LPASGVI RVYAPDTGT I TAKFXEDGEKVKAGDKLFALSTSRFGAGDS VQQQLKT 

70 80 90 100 110 120 

130 140 150 160 170 180 

35 orf 107 . pep EAVLKKTLAEQELGRLKLIHGNETRSLKATVERLENQELHISQQIDGQKRRIRLAEEMLQ 

I I I I I I I I I I M I I I I I I I I I I II I I I I I I I I I I I I I I I I I I I I I I I ll I I I I I I I 
orf 107a EAVLKKTLAEQELGRLKLIHGNETRSLKATVERLENQELHISQQIDGQKRRIRLAEEMLQ 

130 140 150 160 170 180 

189 

40 orf 107. pep KYRFLSXQX 

Mllll 

orf 107a KYRFLSANDAVPKQEMMNVKAELLEQKAKLDAYRREEVGLLQE I RTQNLTLXSLPQAAX 

190 200 210 220 230 
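The identity figures quoted in this section can be checked directly from an aligned pair of sequences. The following Python sketch is illustrative only and is not the program used to generate the reported results. The function names `count_identities` and `percent_identity` are hypothetical, and the sketch assumes an ungapped overlap of equal length in which an ambiguous residue (X) is scored as a mismatch. The two strings are residues 61-120 of ORF107 and ORF107a as shown in the comparison above.

```python
def count_identities(seq_a: str, seq_b: str) -> int:
    """Count positions at which two equal-length aligned sequences agree."""
    if len(seq_a) != len(seq_b):
        raise ValueError("aligned sequences must have the same length")
    # Assumption: an unknown residue ('X') never counts as a match.
    return sum(1 for a, b in zip(seq_a, seq_b) if a == b and a != "X")


def percent_identity(seq_a: str, seq_b: str) -> float:
    """Identical positions as a percentage of the overlap length."""
    if not seq_a:
        raise ValueError("overlap must not be empty")
    return 100.0 * count_identities(seq_a, seq_b) / len(seq_a)


# Residues 61-120 of ORF107 and ORF107a, copied from the comparison above.
ORF107_61_120 = "TVEGQILPASGVIRVYAPDTXTITAKFVEDGXKVKAGDKLFALSTSRFGAGGSVQQQLKT"
ORF107A_61_120 = "TVEGQILPASGVIRVYAPDTGTITAKFXEDGEKVKAGDKLFALSTSRFGAGDSVQQQLKT"

print(count_identities(ORF107_61_120, ORF107A_61_120))
print(round(percent_identity(ORF107_61_120, ORF107A_61_120), 1))
```

As a matter of arithmetic, 182 identical positions over a 186-residue overlap corresponds to 97.8%.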






The complete length ORF107a nucleotide sequence [<SEQ ID 421>] (SEQ ID NO: 421) is:
1 ATGAATAGAC CCAAGCAACC NTTCTTCCGT CCCGAAGTCG CCGTTGCCCG

51 CCAAACCAGC CTGACGGGTA AAGTGATTCT GACACGACCG TTGTCATTTT 

101 CCCTATGGAC GACATTTGCA TCGATATCTG CGTTATTGAT TATCCTGTTT 

151 TTGATATTTG GTAACTATAC GCGAAAGACA ACAGTGGAGG GACAAATTTT 

201 ACCTGCATCG GGCGTAATCA GGGTGTATGC ACCGGATACG GGGACAATTA 

251 CNGCGAAATT CNTGGAAGAT GGAGAAAAGG TTAAGGCTGG CGACAAGCTA 

301 TTTGCGCTTT CGACCTCACG TTTCGGCGCA GGAGATAGCG TGCAGCAGCA 

351 GTTGAAAACG GAGGCAGTTT TGAAGAAAAC GTTGGCAGAA CAGGAACTGG 

401 GTCGTCTGAA GCTGATACAC GGGAATGAAA CGCGCAGCCT TAAAGCAACT

451 GTCGAACGTT TGGAAAACCA GGAACTCCAT ATTTCGCAAC AGATAGACGG

501 TCAGAAAAGG CGCATTAGAC TTGCGGAAGA AATGTTGCAG AAATATCGTT 

551 TCCTATCCGC CAATGATGCA GTGCCAAAAC AAGAAATGAT GAATGTCAAG 

601 GCAGAGCTTT TAGAGCAGAA AGCCAAACTT GATGCCTACC GCCGAGAAGA 

651 AGTCGGGCTG CTTCAGGAAA TCCGCACGCA GAATCTGACA TTGGNNAGCC 

701 TCCCCCAAGC GGCATGA 

This encodes a protein having amino acid sequence [<SEQ ID 422>] (SEQ ID NO: 422):

1 MNRPKQPFFR PEVAVARQTS LTGKVILTRP LSFSLWTTFA SISALLIILF

51 LIFGNYTRKT TVEGQILPAS GVIRVYAPDT GTITAKFXED GEKVKAGDKL

101 FALSTSRFGA GDSVQQQLKT EAVLKKTLAE QELGRLKLIH GNETRSLKAT

151 VERLENQELH ISQQIDGQKR RIRLAEEMLQ KYRFLSANDA VPKQEMMNVK

201 AELLEQKAKL DAYRREEVGL LQEIRTQNLT LXSLPQAA* 

Homology with a predicted ORF from N. gonorrhoeae

ORF107 (SEQ ID NO: 420) shows 95.7% identity over a 188aa overlap with a predicted ORF
(ORF107.ng) (SEQ ID NO: 424) from N. gonorrhoeae:

orf 107. pep MNRPKQPFFRPEVAVARQTSLTGKVILTRPLSFSLWTTFASISALLIILFLIFGNYTRKT 60 

MM MMMMMMMMMM MMMMMMMMMMMM I MMMI 

orf 107ng MNRPKQPFFRPEVAIARQTSLTGKVILTRPLSFSLWTTFASISALLIILFLIFGNYTRKT 60 

orf 107 . pep TVEGQ I LPASGVIRVYAPDTXTI TAKFVEDGXKVKAGDKLFALSTSRFGAGGSVQQQLKT 120 

hllllllllllllllllll llllllllll llllllllllllllllllllllllllll 
orf 107ng TMEGQILPASGVIRVYAPDTGTITAKFVEDGEKVKAGDKLFALSTSRFGAGGSVQQQLKT 120 

orf 107 . pep EAVLKKTLAEQELGRLKL I HGNETRSLKATVERLENQELH I SQQ I DGQKRRI RLAEEMLQ 180 

IIIIIIIIIIIIIIIMIII MM MMMMMMMMMMM MMMMM 

or f 1 0 7 ng EAVLKKTLAEQELGRLKL I HENETRS LKATVERLENQKLH I S QQ I DGQ KRR I RLAEEMLR 180 

orf 107. pep KYRFLSXQ 188 

llllll I 
orfl07ng KYRFLSAQ 188 

The complete length ORF107ng nucleotide sequence [<SEQ ID 423>] (SEQ ID NO: 423) is
predicted to encode a protein having amino acid sequence [<SEQ ID 424>] (SEQ ID NO: 424):

1 MNRPKQPFFR PEVAIARQTS LTGKVILTRP LSFSLWTTFA SISALLIILF

51 LIFGNYTRKT TMEGQILPAS GVIRVYAPDT GTITAKFVED GEKVKAGDKL

101 FALSTSRFGA GGSVQQQLKT EAVLKKTLAE QELGRLKLIH ENETRSLKAT

151 VERLENQKLH ISQQIDGQKR RIRLAEEMLR KYRFLSAQ*

Based on the presence of a putative transmembrane domain in the gonococcal protein, it is predicted
that the proteins from N. meningitidis and N. gonorrhoeae, and their epitopes, could be useful
antigens for vaccines or diagnostics, or for raising antibodies.

Example 50 

The following DNA sequence, believed to be complete, was identified in N. meningitidis [<SEQ ID
425>] (SEQ ID NO: 425):

1 ATGCTGAATA CTTTTTTTGC CGTATTGGGC GGCTGCCTGC TGCT.TTGCC

51 GTGCGGCAAA TCCGTAAATA CGGCGGTACA GCCGCAAAAC GCGGTACAAA 

101 GCGCGCCGAA ACCGGTTTTC AAAGTCATAT ATATCGACAA TACGGCGATT 

151 GCCGGTTTGG ATTTGGGACA AAGCAGCGAA GGCAAAACCA ACGACGGCAA 

201 AAAACAAATC AGTTATCCGA TTAAAGGCTT GCCGGAACAA AATGTTATCC 

251 GACTGATCGG CAAGCATCCC GGCGACTTGG AAGCCGTCAG CGGCAAATGT 

301 ATGGAAACCG ATGATAAGGA CAGTCCGGCA GGTTGGGCAG AAAACGGCGT

351 GTGCCATACC TTGTTTGCCA AACTGGTGGG CAATATCGCC GAAGACGGCG 

401 GCAAACTGAC GGATTACCTA GTTTCGCATG CCGCCCTGCA ACCCTATCAG 

451 GCAGGCAAAA GCGGCTATGC CGCCGTGCAG AACGGACGCT ATGTGCTGGA

501 AATCGACAGC GAAGGGGCGT TTTATTTCCG CCGCCGCCAT TATTGA 

This corresponds to the amino acid sequence [<SEQ ID 426; ORF108>] (SEQ ID NO: 426;
ORF108):

1 MLNTFFAVLG GCLLXLPCGK SVNTAVQPQN AVQSAPKPVF KVIYIDNTAI 
51 AGLDLGQSSE GKTNDGKKQI SYPIKGLPEQ NVIRLIGKHP GDLEAVSGKC 
101 METDDKDSPA GWAENGVCHT LFAKLVGNIA EDGGKLTDYL VSHAALQPYQ
151 AGKSGYAAVQ NGRYVLEIDS EGAFYFRRRH Y* 
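
The correspondence between a nucleotide listing and its amino acid sequence can be illustrated with a short translation routine. This is a minimal sketch, assuming the standard genetic code read in frame from the first base. The function name `translate` is hypothetical, and any codon containing a character other than A, C, G, or T (for example the "." in the listing above) is reported as X. It is not the software used to produce the sequences in this document.

```python
from itertools import product

# Standard genetic code, enumerated with bases in the order T, C, A, G.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(dna: str) -> str:
    """Translate a DNA string in frame 1; unknown codons become 'X'."""
    dna = "".join(dna.split()).upper()  # drop the spaces between 10-base blocks
    protein = []
    for i in range(0, len(dna) - 2, 3):  # a trailing partial codon is ignored
        protein.append(CODON_TABLE.get(dna[i:i + 3], "X"))
    return "".join(protein)


# First 60 positions of the ORF108 nucleotide listing above.
first_60 = "ATGCTGAATA CTTTTTTTGC CGTATTGGGC GGCTGCCTGC TGCT.TTGCC GTGCGGCAAA"
print(translate(first_60))
```

Under these assumptions the output can be compared residue by residue with the first twenty positions of the amino acid listing above; a stop codon is printed as "*", matching the notation used in the listings.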

Further work revealed the following DNA sequence [<SEQ ID 427>] (SEQ ID NO: 427):



1 ATGCTGAAAA CATCTTTTGC CGTATTGGGC GGCTGCCTGC TGCTTGCCGC 

51 CTGCGGCAAA TCCGAAAATA CGGCGGAACA GCCGCAAAAC GCGGTACAAA 

101 GCGCGCCGAA ACCGGTTTTC AAAGTCAAAT ATATCGACAA TACGGCGATT 

151 GCCGGTTTGG ATTTGGGACA AAGCAGCGAA GGCAAAACCA ACGACGGCAA 

201 AAAACAAATC AGTTATCCGA TTAAAGGCTT GCCGGAACAA AATGTTATCC 

251 GACTGATCGG CAAGCATCCC GGCGACTTGG AAGCCGTCAG CGGCAAATGT

301 ATGGAAACCG ATGATAAGGA CAGTCCGGCA GGTTGGGCAG AAAACGGCGT
351 GTGCCATACC TTGTTTGCCA AACTGGTGGG CAATATCGCC GAAGACGGCG

401 GCAAACTGAC GGATTACCTA GTTTCGCATG CCGCCCTGCA ACCCTATCAG
451 GCAGGCAAAA GCGGCTATGC CGCCGTGCAG AACGGACGCT ATGTGCTGGA
501 AATCGACAGC GAAGGGGCGT TTTATTTCCG CCGCCGCCAT TATTGA 

This corresponds to the amino acid sequence [<SEQ ID 428; ORF108-1>] (SEQ ID NO: 428;
ORF108-1):

1 MLKTSFAVLG GCLLLAACGK SENTAEQPQN AVQSAPKPVF KVKYIDNTAI
51 AGLDLGQSSE GKTNDGKKQI SYPIKGLPEQ NVIRLIGKHP GDLEAVSGKC
101 METDDKDSPA GWAENGVCHT LFAKLVGNIA EDGGKLTDYL VSHAALQPYQ
151 AGKSGYAAVQ NGRYVLEIDS EGAFYFRRRH Y*

Computer analysis of this amino acid sequence gave the following results: 

Homology with a predicted ORF from N. gonorrhoeae 

ORF108 (SEQ ID NO: 426) shows 88.4% identity over a 181aa overlap with a predicted ORF
(ORF108.ng) (SEQ ID NO: 430) from N. gonorrhoeae:

orf 108 . pep MLNTFFAVLGGCLLXLPCGKSVNTAVQPQNAVQSAPKPVFKVIYIDNTAIAGLDLGQSSE 60 

Ih I Mill II I MM III II II hill III II 1 1 1 1 II II MM I Mill 

or f 1 0 8ng MLKI PFAVLGGCLLLAACGKSENTAEQPQNAAQSAPKPVFKVKYIDNTAIAGLALGQSSE 6 0 

10 orf 108 .pep GKTNDGKKQISYPIKGLPEQNVIRLIGKHPGDLEAVSGKCMETDDKDSPAGWAENGVCHT 120 

1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 - 1 1 lllhlllll MM IMMMMMIMI 

or f 1 0 8 ng GKTNDGKKQI S YP I KGLPEQNAVRLTGKHPNDLEAWGKCMETDGKDAPSGWAENGVCHT 120 

orf 108 . pep LFAKLVGN I AEDGGKLTDYLVSHAALQP YQAGKSGYAAVQNGRYVLE I DS EGAFYFRRRH Y 181 

IIMIIIIIIIIIIIIIIIMIMIIIIMIIIIIIIIIMIIIIIMIIIMIIIIIIII 

15 orfl08ng LFAKLVGN I AEDGGKLTD YL I S HSALQP YQAGKSGYAAVQNGRYVLE IDS EGAFYFRRRH Y 181 

ORF108-1 (SEQ ID NO: 428) shows 92.3% identity with ORF108ng (SEQ ID NO: 430) over the
same 181 aa overlap:

orf 108 - 1 . pep MLKTSFAVLGGCLLLAACGKSENTAEQPQNAVQSAPKPVFKVKYIDNTAIAGLDLGQSSE 60 

III IMMIMIMMIMMMMMIMMMMIIMMIMIMMI MMM 

20 orf 108ng- 1 MLKIPFAVLGGCLLLAACGKSENTAEQPQNAAQSAPKPVFKVKYIDNTAIAGLALGQSSE 60 

orf 108-1 .pep GKTNDGKKQISYPIKGLPEQNVIRLIGKHPGDLEAVSGKCMETDDKDSPAGWAENGVCHT 120 

1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 II I MM Ml MMM I IMMMMMIMI 

orf 108ng- 1 GKTNDGKKQISYPIKGLPEQNAVRLTGKHPNDLEAWGKCMETDGKDAPSGWAENGVCHT 120 

orf 108-1. pep LFAKLVGN I AEDGGKLTDYLVSHAALQP YQAGKSGYAAVQNGRYVLE I DSEGAFYFRRRHY 181 

25 I I I I I I I I I I I I I I I I I I I I : I I : I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 

orf 108ng- 1 L FAKLVGN I AEDGGKLTD YL I SHSALQP YQAGKSGYAAVQNGRYVLE I DSEGAFYFRRRHY 181. 

The complete length ORF108ng nucleotide sequence [<SEQ ID 429>] (SEQ ID NO: 429) is:

1 ATGCTGAAAa tacctTTTGC CGTGTtgggc ggCtgcctGC TGCTTGCCGC

51 CTGCGGCAAA TCCGAAAATa cggcggaACA GCCGCAAAAT gcggCACAAA

101 GCGCGCCGAA ACCGGTTTTC AAAGTCAAAT ACATCGACAA TACGGCGATT 

151 GCCGGTTTGG CTTTGGGACA AAGTAGCGAA GGCAAAACCA acgacgGCAA 

201 AAAACAAATC AGTTATccgA TTAAAGGCTT GCCGGAACAA Aacgccgtcc 

251 gGCTGACCGG AAAGCATCCC AACGACTTGG AagccgtcgT CGGCAAATGT 

301 ATGGAAACCG ACGGAAAGGA CGCGCCTTCG GGCTGGGCGG AAAACGGCGT

351 GTGCCATACC TTGTTTGCCA AACTGGTGGG CAATATCGCC GAAGACGGCG

401 GCAAACTGAC TGATTACCTG ATTTCGCATT CCGCCCTGCA ACCCTATCAG

451 GCAGGCAAAA GCGGCTATGC CGCCGTGCAG AACGGACGCT ATGTGCTGGA

501 AATCGACAGC GagggGGCGT TTTATttccg ccgccgccat tattgA 

This encodes a protein having amino acid sequence [<SEQ ID 430>] (SEQ ID NO: 430):

1 MLKIPFAVLG GCLLLAACGK SENTAEQPQN AAQSAPKPVF KVKYIDNTAI

51 AGLALGQSSE GKTNDGKKQI SYPIKGLPEQ NAVRLTGKHP NDLEAVVGKC

101 METDGKDAPS GWAENGVCHT LFAKLVGNIA EDGGKLTDYL ISHSALQPYQ

151 AGKSGYAAVQ NGRYVLEIDS EGAFYFRRRH Y*

Based on this analysis, including the presence of a predicted prokaryotic membrane lipoprotein 
lipid attachment site (underlined) and a putative ATP/GTP-binding site motif A (P-loop, double- 
underlined) in the gonococcal protein, it is predicted that the proteins from N. meningitidis and 
N. gonorrhoeae, and their epitopes, could be useful antigens for vaccines or diagnostics, or for 
raising antibodies. 



Example 51 



The following DNA sequence was identified in N. meningitidis [<SEQ ID 431>] (SEQ ID NO:
431):



1 ATGGAAGATT TATATATAAT ACTCGCTTTG GGTTTGGTTG CGATGATTGC 

51 CGgATTTATC GATgcgatTg cGggCGGGGG TGGTTTGATT ACGCTGCCCG 

101 CACTCTTGTT GGCAGGTATT CCTCCCGTGT CGGCAATTGC CACCAACAAG 

151 CTGCAAgCAG CCGCTGCTAC GTTTTCAGCT ACGGTTTCTT TTGCACGCAA 

201 AGGTTTGATT GATTGGAAGA AAGGTCTCCC GATTGCCGCA GCATCGTTTG 

251 TAGGCGGCGT GGcCGGTGCA TTATCGGTCA GCTTGGTTTC CAAAGATATT 

301 CTgCTgGCGG TCGTGCCGGT TTTGTTGATA TTTGTCGCAC TGTATTTTGT

351 GTTTTCGCCC AAGCTCGACG GCAGTAAGGA AGGCAAAGCC AGAATGTCTT 

401 TTTTTCTGTT cGGGCTGACG GTCGC . ACCG . CTTTTGGGTT TTTACGACGG 

451 TGTGTTCGGA CCGGGTGTCG GCTCGTTTTT TCTGATTGCC TTTATTGTTT 

501 TGCTCGGCTG CAAgCTGTTG AACGCGATGT CTTACACCAA ATTGGCGAAC 

551 GTTGCCTGCA ATCTTGGTTC GCTATCGGTA TTCCTGCTGC ACGGTTCGAT 

601 TATTTTCCCG ATTGCGGCAA CGaTGGCGGT CGGTGCGTTT GTCGGtGCGA 

651 ATTTAgGTGC GAGATTTGCC GTaCgctTCG GTTCGAAGCT GATTAA 

This corresponds to the amino acid sequence [<SEQ ID 432; ORF109>] (SEQ ID NO: 432;
ORF109):

1 MEDLYIILAL GLVAMIAGFI DAIAGGGGLI TLPALLLAGI PPVSAIATNK

51 LQAAAATFSA TVSFARKGLI DWKKGLPIAA ASFVGGVAGA LSVSLVSKDI

101 LLAVVPVLLI FVALYFVFSP KLDGSKEGKA RMSFFLFGLT VXTAFGFLRR

151 CVRTGCRLVF SDCLYCFARL QAVERDVLHQ IGERCLQSWF AIGIPAARFD

201 YFPDCGNDGG RCVCRCEFRC EICRTLRFEA D*

Further work revealed the following DNA sequence [<SEQ ID 433>] (SEQ ID NO: 433):



1 ATGGAAGATT TATATATAAT ACTCGCTTTG GGTTTGGTTG CGATGATTGC 

51 CGGATTTATC GATGCGATTG CGGGCGGGGG TGGTTTGATT ACGCTGCCCG 

101 CACTCTTGTT GGCAGGTATT CCTCCCGTGT CGGCAATTGC CACCAACAAG 

151 CTGCAAGCAG CCGCTGCTAC GTTTTCAGCT ACGGTTTCTT TTGCACGCAA 

201 AGGTTTGATT GATTGGAAGA AAGGTCTCCC GATTGCCGCA GCATCGTTTG 

251 TAGGCGGCGT GGCCGGTGCA TTATCGGTCA GCTTGGTTTC CAAAGATATT 

301 CTGCTGGCGG TCGTGCCGGT TTTGTTGATA TTTGTCGCAC TGTATTTTGT

351 GTTTTCGCCC AAGCTCGACG GCAGTAAGGA AGGCAAAGCC AGAATGTCTT

401 TTTTTCTGTT CGGGCTGACG GTCGCACCGC TTTTGGGTTT TTACGACGGT

451 GTGTTCGGAC CGGGTGTCGG CTCGTTTTTT CTGATTGCCT TTATTGTTTT

501 GCTCGGCTGC AAGCTGTTGA ACGCGATGTC TTACACCAAA TTGGCGAACG 

551 TTGCCTGCAA TCTTGGTTCG CTATCGGTAT TCCTGCTGCA CGGTTCGATT 

601 ATTTTCCCGA TTGCGGCAAC GATGGCGGTC GGTGCGTTTG TCGGTGCGAA 

651 TTTAGGTGCG AGATTTGCCG TCCGCTTCGG TTCGAAGCTG ATTAAGCCGC 

701 TGCTGATTGT CATCAGCATT TCGATGGCTG TGAAATTGTT GATAGACGAG 

751 AGAAATCCGC TGTATCAGAT GATTGTTTCG ATGTTTTAA 

This corresponds to the amino acid sequence [<SEQ ID 434; ORF109-1>] (SEQ ID NO: 434;
ORF109-1):

1 MEDLYIILAL GLVAMIAGFI DAIAGGGGLI TLPALLLAGI PPVSAIATNK

51 LQAAAATFSA TVSFARKGLI DWKKGLPIAA ASFVGGVAGA LSVSLVSKDI

101 LLAVVPVLLI FVALYFVFSP KLDGSKEGKA RMSFFLFGLT VAPLLGFYDG

151 VFGPGVGSFF LIAFIVLLGC KLLNAMSYTK LANVACNLGS LSVFLLHGSI

201 IFPIAATMAV GAFVGANLGA RFAVRFGSKL IKPLLIVISI SMAVKLLIDE 

251 RNPLYQMIVS MF* 

Computer analysis of this amino acid sequence gave the following results: 



Homology with a predicted ORF from N. meningitidis (strain A) 



ORF109 (SEQ ID NO: 432) shows 95.9% identity over a 147aa overlap with an ORF (ORF109a)
(SEQ ID NO: 436) from strain A of N. meningitidis:



                    10        20        30        40        50        60
orf109.pep  MEDLYIILALGLVAMIAGFIDAIAGGGGLITLPALLLAGIPPVSAIATNKLQAAAATFSA
            ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
orf109a     MEDLYIILALGLVAMIAGFIDAIAGGGGLITLPALLLAGIPPVSAIATNKLQAAAATFSA
                    10        20        30        40        50        60

                    70        80        90       100       110       120
orf109.pep  TVSFARKGLIDWKKGLPIAAASFVGGVAGALSVSLVSKDILLAVVPVLLIFVALYFVFSP
            |||||||||||||||||||||||:|||:||||||||||||||||||||||||||||||||
orf109a     TVSFARKGLIDWKKGLPIAAASFAGGVVGALSVSLVSKDILLAVVPVLLIFVALYFVFSP
                    70        80        90       100       110       120

                   130       140       150       160       170       180
orf109.pep  KLDGSKEGKARMSFFLFGLTVXTAFGFLRRCVRTGCRLVFSDCLYCFARLQAVERDVLHQ
            |||||||||||||||||||||    ||
orf109a     KLDGSKEGKARMSFFLFGLTVAPLLGFYDGVFGPGVGSFFLIAFIVLLGCKLLNAMSYTK
                   130       140       150       160       170       180

The complete length ORF109a nucleotide sequence [<SEQ ID 435>] (SEQ ID NO: 435) is:

1 ATGGAAGATT TATACATAAT ACTCGCTTTG GGTTTGGTTG CGATGATTGC
51 CGGATTTATC GATGCGATTG CGGGTGGGGG TGGTTTGATT ACGCTGCCTG
101 CACTCTTGTT GGCAGGTATT CCTCCCGTGT CGGCAATTGC CACCAACAAG
151 CTGCAAGCAG CCGCTGCTAC GTTTTCGGCT ACGGTTTCTT TTGCACGCAA
201 AGGTTTGATT GATTGGAAGA AAGGTCTCCC GATTGCGGCA GCATCGTTTG
251 CAGGCGGCGT GGTCGGTGCA TTATCGGTCA GCTTGGTTTC CAAAGATATT
301 CTGCTGGCGG TCGTGCCGGT TTTGTTGATA TTTGTCGCGC TGTATTTTGT
351 GTTTTCGCCC AAGCTCGACG GCAGTAAGGA AGGCAAAGCC AGAATGTCTT
401 TTTTTCTGTT CGGTCTGACG GTTGCACCAC TTTTGGGTTT TTACGACGGT
451 GTGTTCGGAC CGGGTGTCGG CTCGTTTTTT CTGATTGCCT TTATTGTTTT
501 GCTCGGCTGC AAGCTGTTGA ACGCGATGTC TTACACCAAA TTGGCGAACG
551 TTGCCTGCAA TCTTGGTTCG CTATCGGTAT TCCTGCTGCA CGGTTCGATT
601 ATTTTCCCGA TTGCGGCAAC GATGGCGGTC GGTGCGTTTG TCGGTGCGAA
651 TTTAGGTGCG AGATTTGCCG TCCGCTTCGG TTCGAAGCTG ATTAAGCCGC
701 TGCTGATTGT CATCAGCATT TCGATGGCTG TGAAATTGTT GATAGACGAG
751 AGAAATCCGC TGTATCAGAT GATTGTTTCG ATGTTTTAA



This encodes a protein having amino acid sequence [<SEQ ID 436>] (SEQ ID NO: 436):

1 MEDLYIILAL GLVAMIAGFI DAIAGGGGLI TLPALLLAGI PPVSAIATNK

51 LQAAAATFSA TVSFARKGLI DWKKGLPIAA ASFAGGVVGA LSVSLVSKDI

101 LLAVVPVLLI FVALYFVFSP KLDGSKEGKA RMSFFLFGLT VAPLLGFYDG

151 VFGPGVGSFF LIAFIVLLGC KLLNAMSYTK LANVACNLGS LSVFLLHGSI

201 IFPIAATMAV GAFVGANLGA RFAVRFGSKL IKPLLIVISI SMAVKLLIDE 

251 RNPLYQMIVS MF* 

ORF109a (SEQ ID NO: 436) and ORF109-1 (SEQ ID NO: 434) show 99.2% identity in 262 aa
overlap:



10 20 30 40 50 60 

orf 109a. pep MEDLYIILALGLVAMIAGFIDAIAGGGGLITLPALLLAGIPPVSAIATNKLQAAAATFSA 

M 1 1 M 1 1 1 II 1 1 1 1 1 II II 1 1 II M 1 1 1 1 M 1 1 1 1 M 1 1 1 1 II I II III 1 1 M 1 1 1 

25 orf 109-1 MEDLYIILALGLVAMIAGFIDAIAGGGGLITLPALLLAGIPPVSAIATNKLQAAAATFSA 

10 20 30 40 50 60 



70 80 90 100 110 120 

orf 109a . pep TVS FARKGL I DWKKGLP I AAASFAGGWGALSVSLVSKD I LLAWPVLLI FVALYFVFSP 

III II II II I II II II II II Ml MM III MM II II II III MIMM II I II 

30 orf 109-1 TVS FARKGL I DWKKGLP I AAAS FVGGVAGALS VSLVS KD I LLAWPVLL I FVALYFVFS P 

70 80 90 100 110 120 



130 140 150 160 170 180 

orf 10 9a . pep KLDGSKEGKARMSFFLFGLTVAPLLGFYDGVFGPGVGSFFLIAFIVLLGCKLLNAMSYTK 

I III I Ml 1 1 1 1 1 II 1 1 II 1 1 . II II II II 1 1 1 1 i II 1 1 II III II II 1 1 1 1 1 1 1 1 

35 orf 109-1 KLDGSKEGKARMSFFLFGLTVAPLLGFYDGVFGPGVGSFFLIAFIVLLGCKLLNAMSYTK 

130 140 150 160 170 180 



190 200 210 220 230 240 

orf 109a. pep LANVACNLGSLSVFLLHGSIIFPIAATMAVGAFVGANLGARFAVRFGSKLIKPLLIVISI 

I Ml 1 1 1 1 1 1 1 1 1 II > 1 1 II II Ml II II II 1 1 III II II II IN M III 1 1 II I IN 

40 orf 109- 1 LANVACNLGS LSVFLLHGS 1 1 FPIAATMAVGAFVGANLGARFAVRFGSKLI KPLLIVIS I 

190 200 210 220 230 240 



250 260 
orf 109a. pep SMAVKLLIDERNPLYQMI VSMFX 

IMIMMMIMIM Mill 

45 orf 109-1 SMAVKLLIDERNPLYQMI VSMFX 

250 260 



Homology with a predicted ORF from N. gonorrhoeae




ORF109 (SEQ ID NO: 432) shows 98.3% identity over a 231aa overlap with a predicted ORF
(ORF109.ng) (SEQ ID NO: 438) from N. gonorrhoeae:

orf 109 .pep MEDLYIILALGLVAMIAGFIDAIAGGGGLITLPALLLAGIPPVSAIATNKLQAAAATFSA 60 

I I I M I I I I II I I I I I I I I I I I I I I I I I M I I I I I II I I I I I I I I I I I I II I I I I I I I I I 
orf 109ng MEDLYIILALGLVAMIAGFIDAIAGGGGLITLPALLLAGIPPVSAIATNKLQAAAATFSA 60 

orf 109. pep TVS FARKGL IDWKKGLP I AAAS FVGGVAGALS VS LVS KD I LLAWPVLL I FVALYFVFS P 120 

Mill II Ml II Mill Ml II IMIIMII II MM Mill Mill I I Mill 

orfl09ng TVSFARKGLIDWKKGLPIAAASFAGGWGALSVSLVSKDILLAWPVLLIFVALYFVFSP 120 

orf 109. pep KLDGSKEGKARMS FFLFGLTVXTAFGFLRRCVRTGCRLVFSDCLYCFARLQAVERDVLHQ 180 

I I I I I I I I I I I I I I I I I I I I I II I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 
orf 1 0 9ng KLDGSKEGKARMS FFLFGLTVATAFGFLRRCVRTGCRLVFSDCLYCFARLQAVERDVLHQ 180 

orf 109 .pep I GERCLQS WFA I G I PAARFDYFPDCGNDGGRCVCRCE FRCE I CRTLRFEAD 231 

MMMMMMMMIMMMIMIMM III I Mill llllll 

orfl09ng I GERCLQS WFA I G I P AARFD YF PDCGNDGGRCVCRCE FRCE I CRPLRFEAD 231 

An ORF109ng nucleotide sequence [<SEQ ID 437>] (SEQ ID NO: 437) was predicted to encode a
protein having amino acid sequence [<SEQ ID 438>] (SEQ ID NO: 438):

1 MEDLYIILAL GLVAMIAGFI DAIAGGGGLI TLPALLLAGI PPVSAIATNK

51 LQAAAATFSA TVSFARKGLI DWKKGLPIAA ASFAGGVVGA LSVSLVSKDI

101 LLAVVPVLLI FVALYFVFSP KLDGSKEGKA RMSFFLFGLT VATAFGFLRR

151 CVRTGCRLVF SDCLYCFARL QAVERDVLHQ IGERCLQSWF AIGIPAARFD 

201 YFPDCGNDGG RCVCRCEFRC EICRPLRFEA D* 

Further work revealed the following gonococcal DNA sequence [<SEQ ID 439>] (SEQ ID NO:
439):



1 ATGGAAGATT TATACATAAT ACTCGCTTTG GGTTTGGTTG CGATGATCGC 

51 CGGATTTATC GATGCGATTG CGGGCGGGGG TGGTTTGATT ACGCTGCCTG 

101 CACTCTTGTT GGCAGGTATT CCTCCCGTGT CGGCAATTGC CACCAACAAG 

151 CTGCAAGCAG CCGCTGCTAC GTTTTCGGCT ACGGTTTCTT TTGCACGCAA 

201 AGGTTTGATT GATTGGAAGA AAGGTCTCCC GATTGCCGCA GCATCGTTTG 

251 CAGGCGGCGT GGTCGGTGCA TTATCGGTCA GCTTGGTTTC CAAAGATATT 

301 TTGCTGGCGG TCGTGCCGGT TTTGTTGATA TTTGTCGCGC TGTATTTTGT 

351 GTTTTCGCCC AAGCTCGACG GCAGTAAGGA AGGCAAAGCC AGAATGTCTT 

401 TTTTTCTATT CGGGCTGACG GTTGCACCGC TTTTGGGTTT TTACGACGGT 

451 GTGTTCGGAC CGGGTGTCGG CTCGTTTTTT CTGATTGCCT TTATTGTTTT 

501 GCTCGGCTGC AAGCTGTTGA ACGCGATGTC TTACACCAAA TTGGCGAACG 

551 TTGCTTGCAA TCTTGGTTCG CTATCGGTAT TCCTGCTGCA CGGTTCGATT 

601 ATTTTCCCGA TTGTGGCAAC GATGGCGGTC GGTGCGTTTG TCGGTGCGAA 

651 TTTAGGTGCG AGATTTGCCG TCCGCTTCGG TTCGAAGCTG ATTAAGCCGC 

701 TGCTGATTGT CATCAGCATT TCGATGGCTG TGAAATTGTT GATAGACGAG 

751 AGAAATCCGC TGTATCAGAT GATTGTTTCG ATGTTTTAA 

This corresponds to the amino acid sequence [<SEQ ID 440; ORF109ng-1>] (SEQ ID NO: 440;
ORF109ng-1):

1 MEDLYIILAL GLVAMIAGFI DAIAGGGGLI TLPALLLAGI PPVSAIATNK
51 LQAAAATFSA TVSFARKGLI DWKKGLPIAA ASFAGGVVGA LSVSLVSKDI
101 LLAVVPVLLI FVALYFVFSP KLDGSKEGKA RMSFFLFGLT VAPLLGFYDG
151 VFGPGVGSFF LIAFIVLLGC KLLNAMSYTK LANVACNLGS LSVFLLHGSI
201 IFPIVATMAV GAFVGANLGA RFAVRFGSKL IKPLLIVISI SMAVKLLIDE 
251 RNPLYQMIVS MF* 

ORF109ng-1 (SEQ ID NO: 440) and ORF109-1 (SEQ ID NO: 434) show 98.9% identity in 262 aa
overlap:

10 20 30 40 50 60 

orf 109ng- 1 . pep MEDLYI ILALGLVAMIAGFIDAIAGGGGLITLPALLLAGIPPVSAIATNKLQAAAATFSA 

1 1 1 N 1 1 1 II I i 1 1 1 1 1 1 M 1 1 1 1 1 II 1 1 1 1 1 M 1 1 1 1 1 1 1 i 1 1 1 1 M 1 1 1 1 1 1 1 1 

orf 109-1 MEDLYIILALGLVAMIAGFIDAIAGGGGLITLPALLLAGIPPVSAIATNKLQAAAATFSA 

10 20 30 40 50 60 

70 80 90 100 110 120 

orf 109ng- 1 . pep TVS FARKGL I DWKKGLPIAAASFAGGWGALSVSLVSKD I LLAWPVLLI FVALYFVFSP 

I M M I I I I I I I I I M i I I I I M I hi II I I I I I I I I ' I I I I I I I I I I I I II I I I M 
orf 109-1 TVS FARKGL I DWKKGLP I AAAS FVGGVAGALS VSLVS KD I LLAWP VLL I FVALYFVFS P 

70 80 90 100 110 120 

130 140 150 160 170 180 

orf 109ng-l.pep KLDGSKEGKARMSFFLFGLTVAPLLGFYDGVFGPGVGSFFLIAFIVLLGCKLLNAMSYTK 
I I I I I I I I I I I I I I I I I I II I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I M I I I I 
orf 10 9-1 KLDGSKEGKARMSFFLFGLTVAPLLGFYDGVFGPGVGSFFLIAFIVLLGCKLLNAMSYTK 

130 140 150 160 170 180 

190 200 210 220 230 240 

orf 1 0 9ng - 1 . pep LANVACNLGSLSVFLLHGS 1 1 FPI VATMAVGAFVGANLGARFAVRFGSKLI KPLLI VI S I 

I I 1 1 1 1 1 1 M 1 1 1 1 1 1 1 1 1 I hi INI 1 1 1 1 1 1 1 II ■ 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 

orf 109-1 LANVACNLGSLSVFLLHGS 1 1 FPIAATMAVGAFVGANLGARFAVRFGSKLI KPLLI VIS I 

190 200 210 220 230 240 

250 260 
orf 109ng-l .pep SMAVKLLIDERNPLYQMIVSMFX 

IIHillllMI MIIIIIM 

or f 10 9 - 1 SMAVKLLIDERNPLYQMIVSMFX 

250 260 

In addition, ORF109ng-1 (SEQ ID NO: 440) shows homology to a hypothetical Pseudomonas
protein (SEQ ID NO: 1140):

sp|P29942|YCB9_PSEDE HYPOTHETICAL 27.4 KD PROTEIN IN COBO 3' REGION (ORF9)
>gi|94984|pir||138164 hypothetical protein 9 - Pseudomonas sp >gi|551929 (M62866)
ORF9 [Pseudomonas denitrificans] Length = 261
Score = 175 bits (439), Expect = 3e-43

Identities = 83/214 (38%), Positives = 131/214 (60%), Gaps = 1/214 (0%) 

Query: 41 PPVSAIATNKLQXXXXXXXXXXXXXRKGLIDWKKGLPIXXXXXXXXXXXXXXXXXXXKDI 100 

PP+ + TNKLQ R+G ++ K+ LP+ D+ 

Sbjct: 43 PPLQTLGTNKLQGLFGSGSATLSYARRGHVNLKEQLPMALMSAAGAVLGALLATIVPGDV 102

Query: 101 LLAVVPVLLIFVALYFVFSPKLDGSKEGKARMSFFLFGLTVAPLLGFYDGVFGPGVGSFF 160

L A++P LLI +ALYF P + G + +R++ F+F LT+ PL+GFYDGVFGPG GSFF 
Sbjct: 103 LKAILPFLLIAIALYFGLKPNM-GDVDQHSRVTPFVFTLTLVPLIGFYDGVFGPGTGSFF 161 

Query: 161 LIAFIVLLGCKLLNAMSYTKLANVACNLGSLSVFLLHGSIIFPIVATMAVGAFVGANLGA 220

++ F+ L G +L A ++TK N N+G+ VFL G++++ + M +G F+GA +G+ 
Sbjct: 162 MLGFVTLAGFGVLKATAHTKFLNFGSNVGAFGVFLFFGAVLWKVGLLMGLGQFLGAQVGS 221 

Query: 221 RFAVRFGSKLIKPLLIVISISMAVKLLIDERNPL 254 

R+A+ G+K+IKPLL+++SI++A++LL D +PL 
Sbjct: 222 RYAMAKGAKIIKPLLVIVSIALAIRLLADPTHPL 255

Based on this analysis, including the presence of a putative leader sequence (double-underlined) 
and several putative transmembrane domains (single-underlined) in the gonococcal protein, it is 
predicted that the proteins from N. meningitidis and N. gonorrhoeae, and their epitopes, could be 
useful antigens for vaccines or diagnostics, or for raising antibodies. 
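
The summary line of the database comparison reported above (Identities, Positives, Gaps) has a regular form that can be read programmatically. The following is a minimal sketch only: `parse_alignment_summary` is a hypothetical helper, and the pattern assumes the "Name = n/m (p%)" layout shown in that summary line. It returns the raw counts and leaves the printed percentages untouched.

```python
import re

# Matches fields such as "Identities = 83/214"; the percentage in
# parentheses is deliberately not parsed.
SUMMARY_FIELD = re.compile(r"(\w+) = (\d+)/(\d+)")


def parse_alignment_summary(line: str) -> dict[str, tuple[int, int]]:
    """Return {field name: (count, alignment length)} for one summary line."""
    return {
        name: (int(count), int(length))
        for name, count, length in SUMMARY_FIELD.findall(line)
    }


summary = "Identities = 83/214 (38%), Positives = 131/214 (60%), Gaps = 1/214 (0%)"
counts = parse_alignment_summary(summary)
print(counts["Identities"])
```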

Example 52 

The following partial DNA sequence was identified in N. meningitidis [<SEQ ID 441>] (SEQ ID
NO: 441):

1 . . CTGCTAGGGT ATTGCATCGG TTATCGGTAC GgCTGTTGCA GCAAAACCAG 

51 CCGCAGACGG ATTATTTGGT CAAATTCGGA TCGTTTTGGG CGAG . ATTTT 

101 TGGTTTTCTG GGACTGTATG ACGTCTATGC TTCGGCATGG TTTGTCGTTA 

151 TCATGATGTT TTTGGTGGTT TCTACCAGTT TGTGCCTGAT TCGCAATGTG 

201 CCGCCGTTCT GGCGCGAAAT GAAGTCTTTT CGGGAAAAGG TTAAAGAAAA 

251 ATCTCTGGCG GCGATGCGCC ATTCTTCGCT GTTGGATGTA AAAATTGCGC 

301 CCGAGGTTGC CAAACGTTAT CTGGAAGTAC AAGGTTTTCA GGGGAAAACC 

351 ATTAACCGTG AAGACGGGTC GGTTCTGATT GCCGCCAAAA AAGGCACAAT 

401 GAACAAATGG GGCTATATCT TTGCCCATGT TGCTTTGATT GTCATTTGCC

451 TGGGCGGGTT GATAGACAGT AACCTGCTGT TGAAACTGGG TATGCTGACC 

501 GGTCGGATTg TTCCGGACAA TCAGGCGGTT TATGCCAAGG ATTTC.AAGC 

551 CCGAAAGTAT . TTTGGGTGC gTCCAATCTC TCATTTAGGG GCAACGTCAA 

601 TATTTCCG . A GGGGCAGAgT GCGGATGTGG TTTTCCTGA 

This corresponds to the amino acid sequence [<SEQ ID 442; ORF110>] (SEQ ID NO: 442;
ORF110):

1 . . LLGIASVIGT LLQQNQPQTD YLVKFGSFWA XIFGFLGLYD VYASAWFVVI

51 MMFLVVSTSL CLIRNVPPFW REMKSFREKV KEKSLAAMRH SSLLDVKIAP

101 EVAKRYLEVQ GFQGKTINRE DGSVLIAAKK GTMNKWGYIF AHVALIVICL

151 GGLIDSNLLL KLGMLTGRIF RTIRRFMPRI XKPESXFGCV QSLI*GQRQY

201 FXRGRVRMWF S* 

Computer analysis of this amino acid sequence gave the following results: 



Homology with ORF88a from N. meningitidis (strain A) 



ORF110 (SEQ ID NO: 442) shows 91.5% identity over a 188aa overlap with ORF88a (SEQ ID
NO: 332) from strain A of N. meningitidis:

10 20 30 40 50 60

or f 8 8a. pep MSKSRRSPPLLSRPWFAFFSSMRFA VALLSLLGIASVIGTVLQ QNQPQTDYLVKFGSFWA 

I i ' I I i I I M I I I I I I I I I I I II I I I I 
or f 1 1 0 LLGIASVIGTLL QQNQPQTDYLVKFGSFWA' 
5 10 20 30 

70 80 90 100 110 120 

orf 88a . pep QIFGFLGLYDVYASAW FWIMMFLWSTSLCLI RNVPPFWREMKSFREKVKEKSLAAMRH 
I I I I II I I I II I I I I I I I I I I I I I I I I I I I I I I I I I I 1 1 1 1 1 1 1 1 1 1 1 M I I I I I II 
orf 110 XIFGFLGLYDVYASAW FWIMMFLVVSTSLCLI RNVPPFWREMKSFREKVKEKSLAAMRH 
10 40 50 60 70 80 90 

130 140 150 160 170 180 

orf 8 8a. pep S S LLDVKI APE VAKR YLE VQGFQGKT INREDGS VL I AAKKGTMNKWG Y I FAHVAL IVICL 

llllllllll lllllllllllllllllillll IIIIIIIIIIIIIIMI llllllll 

orf 110 . SSLLDVKI APEVAKRYLEVQGFQGKTINREDGSVLI AAKKGTMNKWG Y I FAHVALIVICL 

15 100 110 120 130 140 150 

190 200 210 220 230 240 

orf 88a . pep GGLI DSNLLLKLGMLTGRIVPDNQAVYAKDFKPESILGASNLSFRGNVNISEGQSADWF 

MM! MM MINIM : : : I II I M 

orf 110 GGLIDSNLLLKLGMLTGRI FRTIRRFMPRIXKPESXFGCVQSLIXGQRQYFXRGRVRMWF 

20 160 170 180 19.0 200 210 

250 260 270 280 290 300 

orf 88a. pep LNADNGILVQDLPFEVKLKKFHIDFYNTGMPRDFASDIEVTDKATGEKLERTIRVNHPLT 






orf 110 SX 

However, ORF88 (SEQ ID NO: 328) and ORF110 (SEQ ID NO: 442) do not align, because they
represent two different fragments of the same protein.

Homology with a predicted ORF from N. gonorrhoeae 

ORF110 (SEQ ID NO: 442) shows 88.6% identity over a 211aa overlap with a predicted ORF
(ORF110.ng) (SEQ ID NO: 444) from N. gonorrhoeae:



orf 110 .pep LLGIASVIGTLLQQNQPQTDYLVKFGSFWA 3 0 

MMMIMhllMMIMIMIM M: 

orf HOng MSKSRISPTLLSRPWFAFFSSMRFAVALLSLLGIASVIGTVLQQNQPQTDYLVKFGPFWT 60 

orf 110 .pep XIFGFLGLYDVYASAWFWIMMFLWSTSLCLIRNVPPFWREMKSFREKVKEKSLAAMRH 90 
35 | | | | | | | || | | | | | | | | | M | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | I 

orf HOng RIFDFLGLYDVYASAWFWIMMFLWSTSLCLIRNVPPFWREMKSFREKVKEKSLAAMRH 12 0 

orf 110 .pep SSLLDVKI APEVAKRYLEVQGFQGKTINREDGSVLI AAKKGTMNKWGYI FAHVALIVICL 150 

, 1 1 II 1 1 1 M 1 1 1 1 M 1 1 M II 1 1 l-l 1 1 1 1 1 1 1 1 1 M M M 1 1 1 1 llllllllll 

orf HOng SSLLDVKI APEVAKRYLEVRGFQGKTVSREDGSVL I AAKKGTMNKWGY IXAHVAL IVICL 180 

40 orf 110 .pep GGLIDSNLLLKLGMLTGRI FRTIRRFMPRIXKPESXFGCVQSLIXGQRQYFXRGRVRMWF 210 

I lh lllllll M Ml: II I I I I I I I I H Mill I M I M IhlMII 

orf HOng GRLINXNLLLKLGMLAGSIFRNNRRVMPRISKPESIWGGVQSLIKGQRQYFQRGKVRMWF 240 

orf110.pep  S 211

orf110ng    S 241



The complete length ORF110ng nucleotide sequence [<SEQ ID 443>] (SEQ ID NO: 443) is
predicted to encode a protein having amino acid sequence [<SEQ ID 444>] (SEQ ID NO: 444):

1 MSKSRISPTL LSRPWFAFFS SMRFAVALLS LLGIASVIGT VLQQNQPQTD

51 YLVKFGPFWT RIFDFLGLYD VYASAWFVVI MMFLVVSTSL CLIRNVPPFW

101 REMKSFREKV KEKSLAAMRH SSLLDVKIAP EVAKRYLEVR GFQGKTVSRE

151 DGSVLIAAKK GTMNKWGYIX AHVALIVICL GRLINXNLLL KLGMLAGSIF

201 RNNRRVMPRI SKPESIWGGV QSLIKGQRQY FQRGKVRMWF S* 



Based on the putative transmembrane domains in the gonococcal protein, it is predicted that the 
proteins from N. meningitidis and N. gonorrhoeae, and their epitopes, could be useful antigens for 
vaccines or diagnostics, or for raising antibodies. 



Example 53

The following DNA sequence was identified in N. meningitidis [<SEQ ID 445>] (SEQ ID NO:
445):

1 ATGCCGTCTG AAACACGCCT GCCGAACTTT ATCCGCGTCT TGATATTTGC
51 CCTGGGTTTC ATCTTCCTGA ACGCCTGTTC GGAACAAACC GCGCAAACCG
101 TTACCCTGCA AGGCGAAACG ATGGGCACGA CCTATACCGT CAAATACCTT
151 TCAAATAATC GGGACAAACT CCCCTCACCT GCCGAAATAC AAAAACGCAT
201 CGATGACGCG CTTAAAGAAG TCAACCGGCA GATGTCCACC TATCAGCCCG
251 ACTCCGAAAT CAGCCGGTTC AACCAACACA CAGCCGGCAA GCCCCTCCGC
301 ATTTCAAGCG ACTTCGCACA CGTTACTGCC GAAGCCGTCC GCCTGAACCG
351 CCTGACACAC GGCGCGCTGG ACGTAACCGT CGGCCCCTTG GTCAACCTTT
401 GGGGATTCGG CCCCGACAAA TCCGTTACCC GTGAACCGTC GCCGGAACAA
451 ATCAAACAGG CGGCATCTTA TACGGGCATA GACAAAATCA TTTTGAAACA
501 AGGCAAAGAT TACGCTTCCT TGAGCAAAAC CCACCCCAAG GCCTATTTGG
551 ATTTATCTTC GATTGCCAAA GGCTTCGGCG TTGATAAAGT TGCGGGCGAA
601 CTGGAAAAAT ACGGCATTCA AAATTATCTG GTCGAAATCG GCGGCGAGTT
651 GCACGGCAAA GGCAAAAACG CGCGCGGCGA ACCGTGGCGC ATCGGTATCG
701 AGCAGCCCAA TATCGTCCAA GGCGGCAATA CGCAGATTAT CGTCCCGCTG
751 AACAACCGTT CGCTTGCCAC TTCCGGCGAT TACCGTATTT TCCACGTCGA
801 TAAAAACGGC AAACGCCTCT CCCATATCAT CAACCCGAAC AACAAACGAC
851 CCATCAGCCA CAACCTCGCC TCCATCAGCG TGGTCGCAGA CAGTGCGATG
901 ACGGCGGACG GCTTGTCCAC AGGATTATTC GTATTGGGCG AAACCGAAGC
951 CTTAAAGCTG GCAGAGCGCG AAAAACTCGC TGTTTTCCTG ATTGTCAGGG
1001 ATAAAGGCGG CTACCGCACC GCCATGTCTT CCGAATTTGA AAAACTGCTC
1051 CGCTAA

This corresponds to the amino acid sequence [<SEQ ID 446; ORF111>] (SEQ ID NO: 446;
ORF111):

1 MPSETRLPNF IRVLIFALGF IFLNACSEQT AQTVTLQGET MGTTYTVKYL
51 SNNRDKLPSP AEIQKRIDDA LKEVNRQMST YQPDSEISRF NQHTAGKPLR
101 ISSDFAHVTA EAVRLNRLTH GALDVTVGPL VNLWGFGPDK SVTREPSPEQ
151 IKQAASYTGI DKIILKQGKD YASLSKTHPK AYLDLSSIAK GFGVDKVAGE
201 LEKYGIQNYL VEIGGELHGK GKNARGEPWR IGIEQPNIVQ GGNTQIIVPL
251 NNRSLATSGD YRIFHVDKNG KRLSHIINPN NKRPISHNLA SISVVADSAM
301 TADGLSTGLF VLGETEALKL AEREKLAVFL IVRDKGGYRT AMSSEFEKLL

351 R* 

Computer analysis of this amino acid sequence gave the following results:
Homology with a predicted ORF from N. meningitidis (strain A)

ORF111 (SEQ ID NO: 446) shows 96.9% identity over a 351aa overlap with an ORF (ORF111a)
(SEQ ID NO: 448) from strain A of N. meningitidis:



10 20 30 40 50 60 

orf Ilia . pep MPSETRLPNFIRTLIFALSFIFLNACSEQTAQTVTLQGETMGTTYTVKYLSNNRDXLPSP 

lllllilll MIMMIII llllllllllll MINIMUM IIM 1 1 1 1 

orf 111 MPSETRLPNFIRVLIFALGFIFLNACSEQTAQTVTLQGETMGTTYTVKYLSNNRDKLPSP 
15 10 20 30 40 50 60 

70 80 90 100 110 120 

orf Ilia . pep AEIQXRIDDALKEVNRQMSTYQPDSEISRFNQHTAGKPLRISSDFAHVTAEAVHLNRLTH 

1 1 1 1 I Mil llllllllllll Mill lllllilll llllllllllll MM hll II II 

orf 111 AEIQKRIDDALKETORQMSTYQPDSEISRFNQHTAGKPLRISSDFAHVTAEAVRLNRLTH 
20 70 80 90 100 110 120 

130 140 150 160 170 180 

orf Ilia. pep GALDVTVGPLVNLWGFGPDKSVTREPSPEQ IKQAASYTGI DKI ILKQGKD YASLSKTHPK 

1 1 1 1 M 1 1 1 1 1 1 M 1 1 II 1 1 1 1 1 II 1 1 II 1 1 1 II 1 1 II 1 1 1 1 1 1 1 1 1 1 M 1 1 II 1 1 1 1 II 

orf 111 GALDVTVGPLVNLWGFGPDKSVTREPSPEQ IKQAASYTGI DKI ILKQGKD YASLSKTHPK 

25 130 140 150 160 170 180 

190 200 210 220 230 240 

orf Ilia .pep AYLDLSSIAKGFGVDXVAGELEKYGIQNYLVEIGGELHGKXKNARGEPWRIGIEQPNIVQ 

II 1 1 II II II 1 1 Ml II Illlllllllllll I Mill 1 1 MM II Mill II II II II 

orf 111 AYLDLSSIAKGFGVDKVAGELEKYGIQNYLVEIGGELHGKGKNARGEPWRIGIEQPNIVQ 
30 190 200 210 220 230 240 

250 260 270 280 290 300 

or f 1 1 1 a . pep GGNTQ I I VPLNNRSXATSGD YRI FHVDKSGKRLSH I INPNNKRP I SHNLAS I S VXADSAM 

Illlllllllllll I I I I I : 1 1 1 1 1 M II 1 1 1 1 II 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 IMM 

or f 1 1 1 GGNTQ 1 1 VPLNNRSLATSGDYRI FHVDKNGKRLSH I INPNNKRP I SHNLAS I SWADSAM 

35 250 260 270 280 290 300 

310 320 330 340 350 

orf Ilia . pep TADGXSTGLFVLGETEALKLAEREKLAVFLIVRDKGGYRTAMSSEFEKLLRX 

MM I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 
orf ill TADGLSTGLFVLGETEALKLAEREKLAVFLIVRDKGGYRTAMSSEFEKLLRX 
40 310 320 330 340 350 

The complete length ORF111a nucleotide sequence [<SEQ ID 447>] (SEQ ID NO: 447) is:

1 ATGCCGTCTG AAACACGCCT GCCGAACTTT ATCCGCACCT TGATATTTGC
51 CCTGAGTTTT ATCTTCCTGA ACGCCTGTTC GGAACAAACC GCGCAAACCG
101 TTACCCTGCA AGGTGAAACG ATGGGCACGA CCTATACCGT CAAATACCTT
151 TCAAATAATC GGGACNAACT CCCNTCACCT GCCGAAATAC AAAANCGCAT

201 CGATGACGCG CTTAAAGAAG TCAACCGGCA GATGTCCACC TATCAGCCCG 

251 ACTCCGAAAT CAGCCGGTTC AACCAACACA CAGCCGGCAA GCCCCTCCGC 

301 ATTTCAAGCG ACTTCGCACA CGTTACTGCC GAAGCCGTCC ACCTGAACCG 

351 CCTGACACAC GGCGCGCTGG ACGTAACCGT CGGCCCCTTG GTCAACCTTT

401 GGGGATTCGG CCCCGACAAA TCCGTTACCC GTGAACCGTC GCCGGAACAA

451 ATCAAACAAG CAGCATCTTA TACGGGCATA GACAAAATCA TTTTGAAACA

501 AGGCAAAGAT TACGCTTCCT TGAGCAAAAC CCACCCCAAG GCCTATTTGG 

551 ATTTATCTTC GATTGCCAAA GGCTTCGGCG TTGATNANGT TGCGGGCGAA 

601 CTGGAAAAAT ACGGCATTCA AAATTATCTG GTCGAAATCG GCGGNGAGTT

651 GCACGGCAAA GNCAAAAACG CGCGCGGCGA ACCTTGGCGC ATCGGCATCG 

701 AACAGCCCAA CATCGTCCAA GGCGGCAATA CGCAGATTAT CGTCCCGCTG 

751 AACAACCGTT CGNTTGCCAC TTCCGGCGAT TACCGTATTT TCCACGTCGA 

801 TAAAAGCGGC AAACGCCTCT CCCATATCAT TAATCCGAAC AACAAACGAC 

851 CCATCAGCCA CAACCTCGCC TCCATCAGCG TGNTCGCAGA CAGTGCGATG

901 ACGGCGGACG GCTTNTCCAC AGGATTATTC GTATTGGGCG AAACCGAAGC 

951 CTTAAAGCTG GCAGAGCGCG AAAAACTCGC TGTTTTCCTG ATTGTCAGGG 

1001 ATAAAGGCGG CTACCGCACC GCCATGTCTT CCGAATTTGA AAAACTGCTC 

1051 CGCTAA 






This encodes a protein having amino acid sequence [<SEQ ID 448>] (SEQ ID NO: 448):



1 MPSETRLPNF IRTLIFALSF IFLNACSEQT AQTVTLQGET MGTTYTVKYL

51 SNNRDXLPSP AEIQXRIDDA LKEVNRQMST YQPDSEISRF NQHTAGKPLR 

101 ISSDFAHVTA EAVHLNRLTH GALDVTVGPL VNLWGFGPDK SVTREPSPEQ 

151 IKQAASYTGI DKIILKQGKD YASLSKTHPK AYLDLSSIAK GFGVDXVAGE

201 LEKYGIQNYL VEIGGELHGK XKNARGEPWR IGIEQPNIVQ GGNTQIIVPL 

251 NNRSXATSGD YRIFHVDKSG KRLSHIINPN NKRPISHNLA SISVXADSAM 

301 TADGXSTGLF VLGETEALKL AEREKLAVFL IVRDKGGYRT AMSSEFEKLL 

351 R* 
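
As an illustration only (it is not part of the original analysis), the relationship between a nucleotide listing and the protein it encodes can be sketched in Python. The sketch assumes the standard genetic code and a reading frame that starts at the first base supplied. The name `translate` and the handling of ambiguous bases are assumptions made for this example: a codon containing an ambiguity code is reported as X unless every possible reading gives the same residue.

```python
from itertools import product

# Standard genetic code, codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}

# IUPAC nucleotide ambiguity codes mapped to the bases they may stand for.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT", "K": "GT", "M": "AC",
    "B": "CGT", "D": "AGT", "H": "ACT", "V": "ACG", "N": "ACGT",
}


def translate(dna):
    """Translate a DNA string (spaces and case ignored) in frame 1."""
    seq = "".join(dna.split()).upper()
    protein = []
    for i in range(0, len(seq) - 2, 3):
        codon = seq[i:i + 3]
        # Expand every ambiguous base; unknown symbols are treated like N.
        choices = (IUPAC.get(base, "ACGT") for base in codon)
        residues = {CODON_TABLE["".join(c)] for c in product(*choices)}
        # A unique residue is reported; anything else becomes X.
        protein.append(residues.pop() if len(residues) == 1 else "X")
    return "".join(protein)


print(translate("ATGCCGTCTG AAACACGCCT GCCGAACTTT"))
print(translate("TCAAATAATC GGGACNAACT CCCNTCACCT"))
```

Applied to bases 1-30 and 151-180 of the sequence above, this gives the first residues of rows 1 and 51 of the protein listing, with the NAA codon shown as X and the CCN codon resolved to P.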

Homology with a predicted ORF from N. gonorrhoeae

ORF111 (SEQ ID NO: 446) shows 96.6% identity over a 351aa overlap with a predicted ORF
(ORF111.ng) (SEQ ID NO: 450) from N. gonorrhoeae:



                     10        20        30        40        50        60
orf111ng     MPSETRLPNLIRALIFALGFIFLNACSEQTAQTVTLQGETMGTTYTVKYLSNNRDKLPSP
             |||||||||:||:|||||||||||||||||||||||||||||||||||||||||||||||
orf111       MPSETRLPNFIRVLIFALGFIFLNACSEQTAQTVTLQGETMGTTYTVKYLSNNRDKLPSP
                     10        20        30        40        50        60

                     70        80        90       100       110       120
orf111ng     AKIQKRIDDALKEVNRQMSTYQTDSEISRFNQHTAGKPLRISSDFAHVTAEAVRLNRLTH
             |:|||||||||||||||||||| |||||||||||||||||||||||||||||||||||||
orf111       AEIQKRIDDALKEVNRQMSTYQPDSEISRFNQHTAGKPLRISSDFAHVTAEAVRLNRLTH
                     70        80        90       100       110       120

                    130       140       150       160       170       180
orf111ng     GALDVTVGPLVNLWGFGPDKSVTREPSPEQIKQAASYTGIDKIILQQGKDYASLSKTHPK
             |||||||||||||||||||||||||||||||||||||||||||||:||||||||||||||
orf111       GALDVTVGPLVNLWGFGPDKSVTREPSPEQIKQAASYTGIDKIILKQGKDYASLSKTHPK
                    130       140       150       160       170       180

                    190       200       210       220       230       240
orf111ng     AYLDLSSIAKGFGVDKVAGELEKYGIQNYLVEIGGELHGKGKNAHGEPWRIGIEQPNIIQ
             |||||||||||||||||||||||||||||||||||||||||||| |||||||||||||:|
orf111       AYLDLSSIAKGFGVDKVAGELEKYGIQNYLVEIGGELHGKGKNARGEPWRIGIEQPNIVQ
                    190       200       210       220       230       240

                    250       260       270       280       290       300
orf111ng     GGNTQIIVPLNNRSLATSGDYRIFHVDKNGKRLSHIINPNNKRPISHNLASISVVSDSAM
             |||||||||||||||||||||||||||||||||||||||||||||||||||||||:||||
orf111       GGNTQIIVPLNNRSLATSGDYRIFHVDKNGKRLSHIINPNNKRPISHNLASISVVADSAM
                    250       260       270       280       290       300

                    310       320       330       340       350
orf111ng     TADGLSTGLFVLGETEALRLAEQEKLAVFLIVRDKDGYRTAMSSEFAKLLRX
             ||||||||||||||||||:|||:|||||||||||| |||||||||| |||||
orf111       TADGLSTGLFVLGETEALKLAEREKLAVFLIVRDKGGYRTAMSSEFEKLLRX
                    310       320       330       340       350



The complete length ORF111ng nucleotide sequence [<SEQ ID 449>] (SEQ ID NO: 449) is:



1 ATGCCGTCTG AAACACGCCT GCCGAACCTT ATCCGCGCCT TGATATTTGC 

51 CCTGGGTTTC ATCTTCCTGA ACGCCTGTTC GGaacaaacC GCGCAaaccg 

101 TTACCCTGCA AGGCGAAAcg aTGGGTACGA CCTATACCGT CAAATACCTT 

151 TCAAATAATC GGGACAAACT CCCCTCCCCT GCCAAAATAC AAAAGCGCAT 

201 TGATGATGCG CTTAAAGAAG TCAACCGGCA GATGTCCACC TACCAGACCG 

251 ATTCCGAAAT CAGCCGGTTC AACCAACACA CAGCCGGCAA GCCCCTCCGC 

301 ATTTCAAGCG ATTTCGCACA CGTTACCGCC GAAGCCGTCC GCCTGAACCG 

351 CCTGACTCAC GGCGCACTGG ACGTAACCGT CGGCCCTTTG GTCAACCTTT

401 GGGGGTTCGG CCCCGACAAA TCCGTTACCC GTGAACCGTC GCCGGAACAA 

451 ATCAAACAGG CGGCATCTTA TACGGGCATA GACAAAATCA TTTTGCAACA 

501 AGGCAAAGAT TACGCTTCCT TGAGCAAAAC CCACCCCAAA GCCTATTTGG 

551 ATTTATCTTC GATTGCCAAA GGCTTCGGCG TTGATAAAGT TGCGGGCGAA 

601 CTGGAAAAAT ACGGCATTCA AAATTATCTG GTCGAAAtcg gcggcGAGTT 

651 GCACGGCAAA GGCAAAAATG CGGACGGCGA ACCGTGGCGC ATCGGTATAG 

701 AGCAACCCAA TATCATCCAA GgcgGCAata CGCAGATTAt cgtcccgctg 

751 aaCaaccgtt cgctTGCCAC TTCCGGCGAT TAccgtaTTT tccacgtcgA 

801 TAAAAAcggc aaacgccttt cccacaTCAT CAATCCCaAC aacAAACgac 

851 ccATCAGcca caacctcgcc tccatcagcg tggtctcAGA CAGTGCAATG 

901 ACGGCGGACG GTTtatCCAC AGGATTATTT GTTTTAGGCG AAACCGAAGC 

951 CTTAAGGCTG GCAGAACAAG AAAAACTCGC TGTTTTCCTA ATTGTCCGGG 

1001 ATAAGGACGG CTACGGCACC GCCATGTCTT CCGAATTTGC CAAGCTGCTC 

1051 CGCTAA 



This encodes a protein having amino acid sequence [<SEQ ID 450>] (SEQ ID NO: 450):



1 MPSETRLPNL IRALIFALGF IFLNACSEQT AQTVTLQGET MGTTYTVKYL

51 SNNRDKLPSP AKIQKRIDDA LKEVNRQMST YQTDSEISRF NQHTAGKPLR

101 ISSDFAHVTA EAVRLNRLTH GALDVTVGPL VNLWGFGPDK SVTREPSPEQ

151 IKQAASYTGI DKIILQQGKD YASLSKTHPK AYLDLSSIAK GFGVDKVAGE

201 LEKYGIQNYL VEIGGELHGK GKNAHGEPWR IGIEQPNIIQ GGNTQIIVPL

251 NNRSLATSGD YRIFHVDKNG KRLSHIINPN NKRPISHNLA SISVVSDSAM

301 TADGLSTGLF VLGETEALRL AEQEKLAVFL IVRDKDGYRT AMSSEFAKLL 



This protein shows homology with a hypothetical lipoprotein precursor (SEQ ID NO: 1141) from
H. influenzae:
sp|P44550|YOJL_HAEIN HYPOTHETICAL LIPOPROTEIN HI0172 PRECURSOR >gi|1074292|pir | 4
hypothetical protein HI0172 - Haemophilus influenzae (strain Rd KW20) >gi|1573128
(U32702) hypothetical [Haemophilus influenzae] Length = 346
Score = 353 bits (896), Expect = 9e-97 

Identities = 181/344 (52%), Positives = 247/344 (71%), Gaps = 4/344 (1%) 

Query: 7    LPNLIRALIFALGFIFLNACSEQTAQTVTLQGETMGTTYTVKYLSNNRDKLPSPAKIQKR 66
            +  LI  +I     + L AC ++T + ++L G+TMGTTY VKYL +      S  K  +
Sbjct: 1    MKKLISGIIAVAMALSLAACQKET-KVISLSGKTMGTTYHVKYLDDGSITATSE-KTHEE 58

Query: 67   IDDALKEVNRQMSTYQTDSEISRFNQHT-AGKPLRISSDFAHVTAEAVRLNRLTHGALDV 125
            I+  LK+VN +MSTY+ DSE+SRFNQ+T    P+ IS+DFA V AEA+RLN++T GALDV
Sbjct: 59   IEAILKDVNAKMSTYKKDSELSRFNQNTQVNTPIEISADFAKVLAEAIRLNKVTEGALDV 118

Query: 126  TVGPLVNLWGFGPDKSVTREPSPEQIKQAASYTGIDKIILQQGKDYASLSKTHPKAYLDL 185
            TVGP+VNLWGFGP+K   ++P+PEQ+ +  ++ GIDKI L   K+ A+LSK  P+ Y+DL
Sbjct: 119  TVGPVVNLWGFGPEKRPEKQPTPEQLAERQAWVGIDKITLDTNKEKATLSKALPQVYVDL 178

Query: 186  SSIAKGFGVDKVAGELEKYGIQNYLVEIGGELHGKGKNAHGEPWRIGIEQPNIIQGGNTQ 245
            SSIAKGFGVD+VA +LE+   QNY+VEIGGE+  KGKN  G+PW+I IE+P        +
Sbjct: 179  SSIAKGFGVDQVAEKLEQLNAQNYMVEIGGEIRAKGKNIEGKPWQIAIEKPTTTGERAVE 238

Query: 246  IIVPLNNRSLATSGDYRIFHVDKNGKRLSHIINPNNKRPISHNLASISVVSDSAMTADGL 305
             ++ LNN  +A+SGDYRI+  ++NGKR +H I+P    PI H+LASI+V++ ++MTADGL
Sbjct: 239  AVIGLNNMGMASSGDYRIY-FEENGKRFAHEIDPKTGYPIQHHLASITVLAPTSMTADGL 297

Query: 306  STGLFVLGETEALRLAEQEKLAVFLIVRDKDGYRTAMSSEFAKL 349
            STGLFVLGE +AL +AE+  LAV+LI+R  +G+ T  SS F KL
Sbjct: 298  STGLFVLGEDKALEVAEKNNLAVYLIIRTDNGFVTKSSSAFKKL 341
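
The percentages in the BLAST summary line above follow directly from the reported counts. The Python sketch below is illustrative only: the function name `parse_blast_summary` is hypothetical, and the pattern assumes the "Name = n/m" layout of the summary line shown above.

```python
import re


def parse_blast_summary(line):
    """Return {name: (count, alignment_length, exact_percent)} for a summary line."""
    stats = {}
    for name, num, den in re.findall(r"(\w+) = (\d+)/(\d+)", line):
        count, length = int(num), int(den)
        stats[name] = (count, length, 100.0 * count / length)
    return stats


summary = "Identities = 181/344 (52%), Positives = 247/344 (71%), Gaps = 4/344 (1%)"
for name, (count, length, pct) in parse_blast_summary(summary).items():
    print(f"{name}: {count}/{length} = {pct:.2f}%")
```

For this alignment the exact values are slightly higher than the printed ones (181/344 is about 52.6% and 247/344 is about 71.8%); the figures in parentheses in the summary line correspond to the integer part of each value.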

Based on this analysis, it is predicted that the proteins from N. meningitidis and N. gonorrhoeae, and 
their epitopes, could be useful antigens for vaccines or diagnostics, or for raising antibodies. 

Example 54 

The following partial DNA sequence was identified in N. meningitidis [<SEQ ID 451>] (SEQ ID
NO: 451):



1 . . CCGTGCCGCC GACAGGGCGA CGACGTGTAT GCGGCGCACG CGTCCCGTCA 

51 AAAATTGTGG CTGCGCTTCA TCGGCGGCCG GTCGCATCAA AATATACGGG 

101 GCGGCGCGGC TGCGGACGGG TGGCGCAAAG GCGTGCAAAT CGGCGGCGAG 

151 GTGTTTGTAC GGCAAAATGA AGGCAGCCkA yTGGCAATCG GCGTGATGGG 

201 CGGCAGGGCC GGCCAGCACG CwTCAGTCAA CGGCAAAGGC GGTGCGGCAG 

251 gCAGTGATTT GTATGGTTAT GgCGGGGgTG TTTATGCTgC GTGGCATCAG
301 TTGCGCGATA AACAAACGGG TgCGTATTTG GACGGCTGGT TGCAATACCA

351 ACGTTTCAAA CACCGCATCA ATGATGAAAA CCGTGCGGAA CgCTACAAAA

401 CCAAAGGTTG GACGGCTTCT GTCGAAGGCG GCTACAACGC GCTTGTGGCG
451 GAAGGCATTG TCGGAAAAGG CAATAATGTG CGGTTTTACC TACAACCGCA 
501 GgCGCAGTTT ACCTACTTGG GCGTAAACGG CGGCTTTACC GACAGCGAGG 
551 GGACGGCGGT CGGACTGCTC GGCAGCGGTC AGTGGCAAAG CCGCGCCGGC 
601 AtTCGGGCAA AAACCCGTTT TGCTTTGCGT AACGGTGTCA ATCTTCAGCC
651 TTTTGCCGCT TTTAATGTtt TGCACAGGTC AAAATCTTTC GGCGTGGAAA 
701 TGGACGGCGA AAAACAGACG CTGGCAGGCA GGACGGCACT CGAAGGGCGG 
751 TTCGGTATTG AAGCCGGTTG GAAAGGCCAT ATGTCCGCA. . 



This corresponds to the amino acid sequence [<SEQ ID 452; ORF35>] (SEQ ID NO: 452;
ORF35):

1 . . PCRRQGDDVY AAHASRQKLW LRFIGGRSHQ NIRGGAAADG WRKGVQIGGE 

51 VFVRQNEGSX LAIGVMGGRA GQHASVNGKG GAAGSDLYGY GGGVYAAWHQ 

101 LRDKQTGAYL DGWLQYQRFK HRINDENRAE RYKTKGWTAS VEGGYNALVA 

151 EGIVGKGNNV RFYLQPQAQF TYLGVNGGFT DSEGTAVGLL GSGQWQSRAG 

201 IRAKTRFALR NGVNLQPFAA FNVLHRSKSF GVEMDGEKQT LAGRTALEGR

251 FGIEAGWKGH MSA. .

Computer analysis of this amino acid sequence gave the following results: 

Homology with putative secreted VirG-homologue of N. meningitidis (accession number A32247)

ORF35 (SEQ ID NO: 452) and virg-h protein (SEQ ID NO: 1146) show 51% aa identity in 261aa
overlap:

Orf35  5    QGDDVYAAHASRQKLWLRFIGGRSHQNIRGGAA-ADGWRKGVQIGGEVFVRQNEGSXLAI 63
            +  D++     R+ LWLR I G S+Q ++G  A  +G+RKGVQ+GGEVF  QNE + L+I
virg-h 396  KNSDIFDRTLPRKGLWLRVIDGHSNQWVQGKTAPVEGYRKGVQLGGEVFTWQNESNQLSI 455

Orf35  64   GVMGGRAGQHASVNGKG--GAAGSDLYGYGGGVYAAWHQLRDKQTGAYLDGWLQYQRFKH 121
            G+MGG+A Q ++ +          ++ G+G GVYA WHQL+DKQTGAY D W+QYQRF+H
virg-h 456  GLMGGQAEQRSTFHNPDTDNLTTGNVKGFGAGVYATWHQLQDKQTGAYADSWMQYQRFRH 515

Orf35  122  RINDENRAERYKTKGWTASVEGGYNALVAEGIVGKGNNVRFYLQPQAQFTYLGVNGGFTD 181
            RIN E+  ER+ +KG TAS+E GYNAL+AE    KGN++R YLQPQAQ TYLGVNG F+D
virg-h 516  RINTEDGTERFTSKGITASIEAGYNALLAEHFTKKGNSLRVYLQPQAQLTYLGVNGKFSD 575

Orf35  182  SEGTAVGLLGSGQWQSRAGIRAKTRFALRNGVNLQPFAAFNVLHRSKSFGVEMDGEKQTL 241
            SE   V LLGS Q Q+R G++AK +F+L   + ++PFAA N L+ +K FGVEMDGE++ +
virg-h 576  SENAHVNLLGSRQLQTRVGVQAKAQFSLYKNIAIEPFAAVNALYHNKPFGVEMDGERRVI 635

Orf35  242  AGRTALEGRFGIEAGWKGHMS 262
              +TA+E + G+    K H++
virg-h 636  NNKTAIESQLGVAVKIKSHLT 656

Homology with a predicted ORF from N. meningitidis (strain A)

ORF35 (SEQ ID NO: 452) shows 96.9% identity over a 259aa overlap with an ORF (ORF35a)
(SEQ ID NO: 454) from strain A of N. meningitidis:

                                               10        20        30
orf35.pep                              PCRRQGDDVYAAHASRQKLWLRFIGGRSHQNIRG
                                           :|||||||  ||||||||||||||||||||
orf35a       QRLAIPEAEAVLYAQQAYAANTLFGLRAADRGDDVYAADPSRQKLWLRFIGGRSHQNIRG
            310       320       330       340       350       360

                 40        50        60        70        80        90
orf35.pep    GAAADGWRKGVQIGGEVFVRQNEGSXLAIGVMGGRAGQHASVNGKGGAAGSDLYGYGGGV
             |||||| |||||||||||||||||| ||||||||||||||||||||||||| |:||||||
orf35a       GAAADGRRKGVQIGGEVFVRQNEGSRLAIGVMGGRAGQHASVNGKGGAAGSYLHGYGGGV
            370       380       390       400       410       420

                100       110       120       130       140       150
orf35.pep    YAAWHQLRDKQTGAYLDGWLQYQRFKHRINDENRAERYKTKGWTASVEGGYNALVAEGIV
             ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||:|
orf35a       YAAWHQLRDKQTGAYLDGWLQYQRFKHRINDENRAERYKTKGWTASVEGGYNALVAEGVV
            430       440       450       460       470       480

                160       170       180       190       200       210
orf35.pep    GKGNNVRFYLQPQAQFTYLGVNGGFTDSEGTAVGLLGSGQWQSRAGIRAKTRFALRNGVN
             ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
orf35a       GKGNNVRFYLQPQAQFTYLGVNGGFTDSEGTAVGLLGSGQWQSRAGIRAKTRFALRNGVN
            490       500       510       520       530       540

                220       230       240       250       260
orf35.pep    LQPFAAFNVLHRSKSFGVEMDGEKQTLAGRTALEGRFGIEAGWKGHMSA
             |||||||||||||||||||||||||||||||||||||||||||||||||
orf35a       LQPFAAFNVLHRSKSFGVEMDGEKQTLAGRTALEGRFGIEAGWKGHMSARIGYGKRTDGD
            550       560       570       580       590       600

orf35a       KEAALSLKWLFX
            610       620

The complete length ORF35a nucleotide sequence [<SEQ ID 453>] (SEQ ID NO: 453) is:

1 ATGTTCAGAG CTCAGCTTGG TTCAAATACT CGTTCTACCA AAATCGGCGA 

51 CGATGCCGAT TTTTCATTTT CAGACAAGCC GAAACCCGGC ACTTCCCATT 

101 ATTTTTCCAG CGGTAAAACC GATCAAAATT CATCCGAATA TGGGTATGAC 

151 GAAATCAATA TCCAAGGTAA AAACTACAAT AGCGGCATAC TCGCCGTCGA 

201 TAATATGCCC GTTGTTAAGA AATATATTAC AGATACTTAC GGGGATAATT
251 TAAAGGATGC GGTTAAGAAG CAATTACAGG ATTTATACAA AACAAGACCC 
301 GAAGCTTGGG AAGAAAATAA AAAACGGACT GAGGAGGCGT ATATAGAACA 

351 GCTTGGACCA AAATTTAGTA TACTCAAACA GAAAAACCCC GATTTAATTA

401 ATAAATTGGT AGAAGATTCC GTACTCACTC CTCATAGTAA TACATCACAG
451 ACTAGTCTCA ACAACATCTT CAATAAAAAA TTACACGTCA AAATCGAAAA
501 CAAATCCCAC GTCGCCGGAC AGGTGTTGGA ACTGACCAAG ATGACGCTGA 
551 AAGATTCCCT TTGGGAACCG CGCGGCCATT CCGACATCCA TATGCTGGAA 
601 ACTTCCGATA ATGCCCGCAT CCGCCTGAAC ACGAAAGATG AAAAACTGAC 
651 CGTCCATAAA GCGTATCAGG GCGGTGCGGA TTTCCTGTTC GGCTACGACG 
701 TGCGGGAGTC GGACAAACCC GCCCTGACCT TTGAAGAAAA AGTCAGCGGA 
751 CAATCCGGCG TGGTTTTGGA ACGCCGGCCG GAAAATCTGA AAACGCTCGA 
801 CGGGCGCAAA CTGATTGCGG CGGAAAAGGC AGACTCTAAT TCGTTTGCGT 
851 TTAAACAAAA TTACCGGCAG GGACTGTACG AATTATTGCT CAAGGAATGC 
901 GAAGGCGGAT TTTGCTTGGG CGTGCAGCGT TTGGCTATCC CCGAGGCGGA 
951 AGCGGTTTTA TATGCCCAAC AGGCTTATGC GGCAAATACT TTGTTCGGGC 

1001 TGCGTGCCGC CGACAGGGGC GACGACGTGT ATGCCGCCGA TCCGTCCCGT 

1051 CAAAAATTGT GGCTGCGCTT CATCGGCGGC CGGTCGCATC AAAATATACG 

1101 GGGCGGCGCG GCTGCGGACG GGCGGCGCAA AGGCGTGCAA ATCGGCGGCG 

1151 AGGTGTTTGT ACGGCAAAAT GAAGGCAGCC GGCTGGCAAT CGGCGTGATG 

1201 GGCGGCAGGG CTGGCCAGCA CGCATCAGTC AACGGCAAAG GCGGTGCGGC
1251 AGGCAGTTAT TTGCATGGTT ATGGCGGGGG TGTTTATGCT GCGTGGCATC 

1301 AGTTGCGCGA TAAACAAACG GGTGCGTATT TGGACGGCTG GTTGCAATAC
1351 CAACGTTTCA AACACCGCAT CAATGATGAA AACCGTGCGG AACGCTACAA 

1401 AACCAAAGGT TGGACGGCTT CTGTCGAAGG CGGCTACAAC GCGCTTGTGG
1451 CGGAAGGCGT TGTCGGAAAA GGCAATAATG TGCGGTTTTA CCTGCAACCG 
1501 CAGGCGCAGT TTACCTACTT GGGCGTAAAC GGCGGCTTTA CCGACAGCGA 
1551 GGGGACGGCG GTCGGACTGC TCGGCAGCGG TCAGTGGCAA AGCCGCGCCG
1601 GCATTCGGGC AAAAACCCGT TTTGCTTTGC GTAACGGTGT CAATCTTCAG 
1651 CCTTTTGCCG CTTTTAATGT TTTGCACAGG TCAAAATCTT TCGGCGTGGA 
1701 AATGGACGGC GAAAAACAGA CGCTGGCAGG CAGGACGGCG CTCGAAGGGC 
1751 GGTTCGGCAT TGAAGCCGGT TGGAAAGGCC ATATGTCCGC ACGCATCGGA 
1801 TACGGCAAAA GGACGGACGG CGACAAAGAA GCCGCATTGT CGCTCAAATG 
1851 GCTGTTTTGA 

This encodes a protein having amino acid sequence [<SEQ ID 454>] (SEQ ID NO: 454):



1 MFRAQLGSNT RSTKIGDDAD FSFSDKPKPG TSHYFSSGKT DQNSSEYGYD

51 EINIQGKNYN SGILAVDNMP VVKKYITDTY GDNLKDAVKK QLQDLYKTRP

101 EAWEENKKRT EEAYIEQLGP KFSILKQKNP DLINKLVEDS VLTPHSNTSQ 

151 TSLNNIFNKK LHVKIENKSH VAGQVLELTK MTLKDSLWEP RRHSDIHMLE 

201 TSDNARIRLN TKDEKLTVHK AYQGGADFLF GYDVRESDKP ALTFEEKVSG 

251 QSGVVLERRP ENLKTLDGRK LIAAEKADSN SFAFKQNYRQ GLYELLLKQC

301 EGGFCLGVQR LAIPEAEAVL YAQQAYAANT LFGLRAADRG DDVYAADPSR 

351 QKLWLRFIGG RSHQNIRGGA AADGRRKGVQ IGGEVFVRQN EGSRLAIGVM

401 GGRAGQHASV NGKGGAAGSY LHGYGGGVYA AWHQLRDKQT GAYLDGWLQY
451 QRFKHRINDE NRAERYKTKG WTASVEGGYN ALVAEGVVGK GNNVRFYLQP

501 QAQFTYLGVN GGFTDSEGTA VGLLGSGQWQ SRAGIRAKTR FALRNGVNLQ

551 PFAAFNVLHR SKSFGVEMDG EKQTLAGRTA LEGRFGIEAG WKGHMSARIG 

601 YGKRTDGDKE AALSLKWLF* 

Homology with a predicted ORF from N. gonorrhoeae 

ORF35 (SEQ ID NO: 452) shows 51.7% identity over a 261aa overlap with a predicted ORF
(ORF35ngh) (SEQ ID NO: 456) from N. gonorrhoeae:

orf35.pep                              PCRRQGDDVYAAHASRQKLWLRFIGGRSHQNIRG 34
                                              |       |  |||| | | | |   |
orf35ngh     FTKVQERDDIAIYAQQAQAANTLFALRLNDKNSDIFDRTLPRKGLWLRVIDGHSNQWVQG 370

orf35.pep    GAA-ADGWRKGVQIGGEVFVRQNEGSXLAIGVMGGRAGQHASVNGKG--GAAGSDLYGYG 91
               |   | ||||| |||||  |||   | || ||| | |                  | |
orf35ngh     KTAPVEGYRKGVQLGGEVFTWQNESNQLSIGLMGGQAEQRSTFRNPDTDNLTTGNVKGFG 430

orf35.pep    GGVYAAWHQLRDKQTGAYLDGWLQYQRFKHRINDENRAERYKTKGWTASVEGGYNALVAE 151
              |||| |||| ||||||| | | ||||| |||| |   ||   || ||| | ||||| ||
orf35ngh     AGVYATWHQLQDKQTGAYVDSWMQYQRFRHRINTEYATERFTSKGITASIEAGYNALLAE 490

orf35.pep    GIVGKGNNVRFYLQPQAQFTYLGVNGGFTDSEGTAVGLLGSGQWQSRAGIRAKTRFALRN 211
                 |||  | ||||||| ||||||| | |||   | |||| | ||| |  ||  ||  |
orf35ngh     HFTKKGNSLRVYLQPQAQLTYLGVNGKFSDSENAQVNLLGSRQLQSRVGVQAKAQFAFTN 550

orf35.pep    GVNLQPFAAFNVLHRSKSFGVEMDGEKQTLAGRTALEGRFGIEAGWKGHMSA 263
             ||  ||| | |     | |||| ||        |  |   |  |  | |
orf35ngh     GVTFQPFVAVNSIYQQKPFGVEIDGDRRVINNKTVIETQLGVAAKIKSHLTLQASFNRQT 610

A partial ORF35ngh nucleotide sequence [<SEQ ID 455>] (SEQ ID NO: 455) is predicted to
encode a protein having partial amino acid sequence [<SEQ ID 456>] (SEQ ID NO: 456):



1 ..KKLRDRNSEY WKEETYHIKS NGRTYPNIPA LFPKHPFDPF ENINNSKKIS
51 FYDKEYTEDY LVGFARGFGV EKRNGEEEKP LRQYFKDCVN TENSNNDNCK
101 ISSFGNYGPI LIKSDIFALA SQIKNSHINS EILSVGNYIE WLRPTLNKLT

151 GWQEHLYAGL DPFHYIEVTD NSHVIGQTID LGALELTNSL WKPRWNSNID 

201 YLITKNAEIR FNTKNESLLV KEDYAGGARF RFAYDLKDKV PEIPVLTFEK 

251 NITGTSDIIF EGKALDNLKH LDGHQIVKVN DTADKDAFRL SSKYRKGIYT

301 LSLQQRPEGF FTKVQERDDI AIYAQQAQAA NTLFALRLND KNSDIFDRTL
351 PRKGLWLRVI DGHSNQWVQG KTAPVEGYRK GVQLGGEVFT WQNESNQLSI

401 GLMGGQAEQR STFRNPDTDN LTTGNVKGFG AGVYATWHQL QDKQTGAYVD
451 SWMQYQRFRH RINTEYATER FTSKGITASI EAGYNALLAE HFTKKGNSLR 
501 VYLQPQAQLT YLGVNGKFSD SENAQVNLLG SRQLQSRVGV QAKAQFAFTN 
551 GVTFQPFVAV NSIYQQKPFG VEIDGDRRVI NNKTVIETQL GVAAKIKSHL 
601 TLQASFNRQT SKHHHAKQGA LNLQWTF* 

Based on this prediction, these proteins from N. meningitidis and N. gonorrhoeae, and their 
epitopes, could be useful antigens for vaccines or diagnostics, or for raising antibodies. 



Example 55 



The following partial DNA sequence was identified in N. meningitidis [<SEQ ID 457>] (SEQ ID
NO: 457):



1 ..GCGGAATATG TTCAGTTCTC TATAGATTTG TTCAGTGTGG GTAAATCGGG

51 GGGCGGTATA CCTAAGGCTA AGCCTGTGTT TGATGCGAAA CCGAGATGGG

101 AGGTTGATAG GAAGCTTAAT AAATTGACAA CTCGTGAGCA GGTGGAGAAA

151 AATGTTCAGG AAACGAGAAG AAGGAGTCAG AGTAGTCAGT TTAAAGCCCA

201 TGCGCAACGA GAATGGGAAA ATAAAACAGG GTTAGATTTT AATCATTTTA

251 TAGGTGGTGA TATCAATAAA AAAGGCACAG TAACAGGAGG GCATAGTCTA

301 ACCCGTGGTG ATGTACGGGT GATACAACAA ACCTCGGCAC CTGATAAACA

351 TGGGGT.TTA TCAAGCGACA GTGGAAATTN A

This corresponds to the amino acid sequence [<SEQ ID 458; ORF46>] (SEQ ID NO: 458;
ORF46):






1 . . AEYVQFSIDL FSVGKSGGGI PKAKPVFDAK PRWEVDRKLN KLTTREQVEK 
51 NVQETRRRSQ SSQFKAHAQR EWENKTGLDF NHFIGGDINK KGTVTGGHSL 
101 TRGDVRVIQQ TSAPDKHGXL SSDSGNX 

Further work revealed further partial nucleotide sequence [<SEQ ID 459>] (SEQ ID NO: 459):



1 . . GCAGTGTGCC TnCCGATGCA TGCACACGCC TCAnATTTGG CAAACGATTC 

51 TTTTATCCGG CAGGTTCTCG ACCGTCAGCA TTTCGAACCC GACGGGAAAT 

101 ACCACCTATT CGGCAGCAGG GGGGAACTTG CCGAGCGCCA GTCTCATATC 

151 GGATTGGGAA AAATACAAAG CCATCAGTTG GGCAACCTGA TGATTCAACA 

201 GGCGGCCATT AAAGGAAATA TCGGCTACAT TGTCCGCTTT TCCGATCACG 

251 GGCACGAAGT CCATTCCCCs TTCGACAACC ATGCCTCACA TTCCGATTCT 

301 GATGAAGCCG GTAGTCCCGT TGACGGATTT AGCCTTTACC GCATCCATTG 

351 GGACGGATAC GAACACCATC CCGCCGACGG CTATGACGGG CCACAGGGCG 

401 GCGGCTATCC CGCTCCCAAA GGCGCGAGGG ATATATACAG TTACGACATA 

451 AAAGGCGTTG CCCAAAATAT CCGCCTCAAC CTGACCGACA ACCGCAGCAC 

501 CGGACAACGG CTTGCCGACC GTTTCCACAA TGCCGGTAGT ATGCTGACGC 

551 AAGGAGTAGG CGACGGATTC AAACGCGCCA CCCGATACAG CCCCGAGCTG 

601 GACAGATCGG GCAATGCCGC CGAAGCCTTC AACGGCACTG CAGATATCGT 

651 TAAAAACATC ATCGGCGCTG CAGGAGAAAT TGT 
This corresponds to the amino acid sequence [<SEQ ID 460; ORF46-1>] (SEQ ID NO: 460;
ORF46-1):

1 . . AVCLPMHAHA SXLANDSFIR QVLDRQHFEP DGKYHLFGSR GELAERQSHI 

51 GLGKIQSHQL GNLMIQQAAI KGNIGYIVRF SDHGHEVHSP FDNHASHSDS

101 DEAGSPVDGF SLYRIHWDGY EHHPADGYDG PQGGGYPAPK GARDIYSYDI

151 KGVAQNIRLN LTDNRSTGQR LADRFHNAGS MLTQGVGDGF KRATRYSPEL 

201 DRSGNAAEAF NGTADIVKNI IGAAGEI 

Computer analysis of this amino acid sequence gave the following results:
Homology with a predicted ORF from N. gonorrhoeae

ORF46 (SEQ ID NO: 458) shows 98.2% identity over a 111aa overlap with a predicted ORF
(ORF46ng) (SEQ ID NO: 462) from N. gonorrhoeae:

orf46.pep                   AEYVQFSIDLFSVGKSGGGIPKAKPVFDAKPRWEVDRKLNKLTTR 45
                                           ||||||||||||||||||||||||||||||
orf46ng      PKTGVPFDGKGFPNFEKHVKYDTKLDIQELSGGGIPKAKPVFDAKPRWEVDRKLNKLTTR 217

orf46.pep    EQVEKNVQETRRRSQSSQFKAHAQREWENKTGLDFNHFIGGDINKKGTVTGGHSLTRGDV 105
             |||||||||||||||||||||||||||||||||||||||||||||||:||||||||||||
orf46ng      EQVEKNVQETRRRSQSSQFKAHAQREWENKTGLDFNHFIGGDINKKGAVTGGHSLTRGDV 277

orf46.pep    RVIQQTSAPDKHGXLSSDSGN 126
             ||||||||||||| |||||||
orf46ng      RVIQQTSAPDKHGVLSSDSGN 298
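
The identity figures quoted for these alignments are ratios of identical aligned positions to the length of the overlap; for instance, 109 identical positions out of 111 would give the 98.2% reported above. A minimal Python sketch of that calculation follows. It is illustrative only: the function name `percent_identity` is hypothetical, and the convention assumed here is that every aligned column counts towards the overlap (including columns with a gap in one sequence) and that X matches nothing but X.

```python
def percent_identity(top, bottom, gap="-"):
    """Return (identical, overlap, percent) for two aligned, equal-length rows."""
    if len(top) != len(bottom):
        raise ValueError("aligned rows must have the same length")
    identical = overlap = 0
    for a, b in zip(top, bottom):
        if a == gap and b == gap:
            continue  # a column that is empty in both rows is not part of the overlap
        overlap += 1
        if a == b:
            identical += 1
    return identical, overlap, 100.0 * identical / overlap


# Last block of the ORF46 / ORF46ng alignment (one X/V mismatch in 21 columns).
print(percent_identity("RVIQQTSAPDKHGXLSSDSGN", "RVIQQTSAPDKHGVLSSDSGN"))
```

Other alignment programs may define the denominator differently (for example, excluding gapped columns), so the convention should be checked before comparing figures from different tools.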

A partial ORF46ng nucleotide sequence [<SEQ ID 461>] (SEQ ID NO: 461) is predicted to encode
a protein having partial amino acid sequence [<SEQ ID 462>] (SEQ ID NO: 462):

1 . .RRLKHCCHAR LGSAFHRKQD GAHQRFGRYG ATQRLCRSSH PRLGSPKPQC 

51 RTRHRSRQQY LYGSHPHQRD WSCPGKIQLG RHHGTSCRAV ADXRDRICER 

101 EIRRQRQXCR CRLGKIPSLS IPKYPLKLEQ RYGKENITSS TVPPSNGKNV 

151 KLADQRHPKT GVPFDGKGFP NFEKHVKYDT KLDIQELSGG GIPKAKPVFD 

201 AKPRWEVDRK LNKLTTREQV EKNVQETRRR SQSSQFKAHA QREWENKTGL

251 DFNHFIGGDI NKKGAVTGGH SLTRGDVRVI QQTSAPDKHG VLSSDSGN* 

Further work revealed the complete gonococcal DNA sequence [<SEQ ID 463>] (SEQ ID NO:
463):

1 TTGGGCATTT CCCGCAAAAT ATCCCTTATT CTGTCCATAC TGGCAGTGTG

51 CCTGCCGATG CATGCACACG CCTCAGATTT GGcaAACGAT CCCTTTATCC 

101 GgCaggttcT CGaccGTCAG CATTTCGaac ccgacggGAa ATACCaCCTA 

151 TTcggCaGCA GGGGGGAGCT TgccnagcGC aacggccATa tcggattggG 

201 aaacaTAcaa Agccatcagt tGggccacct gatgattcaa caggcggccg 

251 ttgaaggaaA TAtcgGctac attgtccgct tttccgatca cgggcacaaa

301 ttccattcgc ccttcGAcaa ccaTGCCTCA CATTCCGATT CTGACGAAGC 
351 CGGTAGTCCC GTTGACGGAT TCAGCCTTTA CCGCATCCAT TGGGACGGAT

401 ACGAACACCA TCCCGCCGAC GGCTATGACG GGCCACAGGG CGGCGGCTAT

451 CCCGCTCCCA AAGGCGCGAG GGATATATAC AGCTACGACA TAAAAGGCGT 

501 TGCCCAAAAT ATCCGCCTCA ACCTGACCGA CAACCGCAGC ACCGGACAAC 

551 GGCTTGCCGA CCGTTTCCAC AATGCCGGCG CTATGCTGAC GCAAGGAGTA 

601 GGCGACGGAT TCAAACGCGC CACCCGATAC AGCCCCGAGC TGGACAGATC 

651 GGGCAATGCc gccGAAGCCT TCAACGGCAC TGCAGATATC GTCAAAAACA 

701 TCATCGGCGC GGCAGGAGAA ATTGTCGGCG CAGGCGATGC CGTGCagGGT 

751 ATAAGCGAAG GCTCAAACAT TGCTGTCATG CACGGCTTGG GTCTGCTTTC 

801 CACCGAAAAC AAGATGGCGC GCATCAACGA TTTGGCAGAT ATGGCGCAAC 

851 TCAAAGACTA TGCCGCAGCA GCCATCCGCG ATTGGGCAGT CCAAAACCCC 

901 AATGCCGCAC AAGGCATAGA AGCCGTCAGC AATATCTTTA TGGCAGCCAT 

951 CCCCATCAAA GGGATTGGAG CTGTCCGGGG AAAATACGGC TTGGGCGGCA 

1001 TCACGGCACA TCCTGTCAAG CGGTCGCAGA TGGGCGCGAT CGCATTGCCG 

1051 AAAGGGAAAT CCGCCGTCAG CGACAATTTT GCCGATGCGG CATACGCCAA 

1101 ATACCCGTCC CCTTACCATT CCCGAAATAT CCGTTCAAAC TTGGAGCAGC 

1151 GTTACGGCAA AGAAAACATC ACCTCCTCAA CCGTGCCGCC GTCAAACGGC 

1201 AAAAATGTCA AACTGGCAGA CCAACGCCAC CCGAAGACAG GCGTACCGTT
1251 TGACGGTAAA GGGTTTCCGA ATTTTGAGAA GCACGTGAAA TATGATACGA

1301 AGCTCGATAT TCAAGAATTA TCGGGGGGCG GTATACCTAA GGCTAAGCCT
1351 GTGTTTGATG CGAAACCGAG ATGGGAGGTT GATAGGAAGC TTAATAAATT

1401 GACAACTCGT GAGCAGGTGG AGAAAAATGT TCAGGAAACG AGAAGAAGGA
1451 GTCAGAGTAG TCAGTTTAAA GCCCATGCGC AACGAGAATG GGAAAATAAA
1501 ACAGGGTTAG ATTTTAATCA TTTTATAGGT GGTGATATCA ATAAGAAAGG 
1551 CACAGTAACA GGAGGGCATA GTCTAACCCG TGGTGATGTA CGGGTGATAC 
1601 AACAAACCTC GGCACCTGAT AAACATGGGG TTTATCAAGC GACAGTGGAA 
1651 ATTAAAAAGC CTGATGGAAG TTGGGAGGTG AAAACGAAAA AAGGTGGGAA 
1701 AGTGATGACC AAGCACACCA TGTTCCCAAA AGATTGGGAT GAGGCTAGAA 
1751 TTAGGGCTGA AGTTACTTCG GCTTGGGAAA GTAGAATAAT GCTTAAGGAT 
1801 AATAAATGGC AGGGTACAAG TAAATCGGGT ATTAAAATAG AAGGATTTAC 
1851 CGAACCTAAT AGAACAGCAT ATCCCATTTA TGAATAG 

This corresponds to the amino acid sequence [<SEQ ID 464; ORF46ng-1>] (SEQ ID NO: 464;
ORF46ng-1):

1 LGISRKISLI LSILAVCLPM HAHASDLAND PFIRQVLDRQ HFEPDGKYHL

51 FGSRGELAXR NGHIGLGNIQ SHQLGHLMIQ QAAVEGNIGY IVRFSDHGHK 

101 FHSPFDNHAS HSDSDEAGSP VDGFSLYRIH WDGYEHHPAD GYDGPQGGGY 

151 PAPKGARDIY SYDIKGVAQN IRLNLTDNRS TGQRLADRFH NAGAMLTQGV 

201 GDGFKRATRY SPELDRSGNA AEAFNGTADI VKNIIGAAGE IVGAGDAVQG

251 ISEGSNIAVM HGLGLLSTEN KMARINDLAD MAQLKDYAAA AIRDWAVQNP 

301 NAAQGIEAVS NIFMAAIPIK GIGAVRGKYG LGGITAHPVK RSQMGAIALP

351 KGKSAVSDNF ADAAYAKYPS PYHSRNIRSN LEQRYGKENI TSSTVPPSNG

401 KNVKLADQRH PKTGVPFDGK GFPNFEKHVK YDTKLDIQEL SGGGIPKAKP
451 VFDAKPRWEV DRKLNKLTTR EQVEKNVQET RRRSQSSQFK AHAQREWENK
501 TGLDFNHFIG GDINKKGTVT GGHSLTRGDV RVIQQTSAPD KHGVYQATVE 
551 IKKPDGSWEV KTKKGGKVMT KHTMFPKDWD EARIRAEVTS AWESRIMLKD 
601 NKWQGTSKSG IKIEGFTEPN RTAYPIYE* 

ORF46ng-1 (SEQ ID NO: 464) and ORF46-1 (SEQ ID NO: 460) show 94.7% identity in 227 aa
overlap:

                                   10        20        30        40
orf46-1.pep                AVCLPMHAHASXLANDSFIRQVLDRQHFEPDGKYHLFGSRGELAER
                           ||||||||||| |||| ||||||||||||||||||||||||||| |
orf46ng-1    LGISRKISLILSILAVCLPMHAHASDLANDPFIRQVLDRQHFEPDGKYHLFGSRGELAXR
                     10        20        30        40        50        60

               50        60        70        80        90       100
orf46-1.pep  QSHIGLGKIQSHQLGNLMIQQAAIKGNIGYIVRFSDHGHEVHSPFDNHASHSDSDEAGSP
               |||||:|||||||:|||||||::||||||||||||||: |||||||||||||||||||
orf46ng-1    NGHIGLGNIQSHQLGHLMIQQAAVEGNIGYIVRFSDHGHKFHSPFDNHASHSDSDEAGSP
                     70        80        90       100       110       120

              110       120       130       140       150       160
orf46-1.pep  VDGFSLYRIHWDGYEHHPADGYDGPQGGGYPAPKGARDIYSYDIKGVAQNIRLNLTDNRS
             ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
orf46ng-1    VDGFSLYRIHWDGYEHHPADGYDGPQGGGYPAPKGARDIYSYDIKGVAQNIRLNLTDNRS
                    130       140       150       160       170       180

              170       180       190       200       210       220
orf46-1.pep  TGQRLADRFHNAGSMLTQGVGDGFKRATRYSPELDRSGNAAEAFNGTADIVKNIIGAAGE
             |||||||||||||:||||||||||||||||||||||||||||||||||||||||||||||
orf46ng-1    TGQRLADRFHNAGAMLTQGVGDGFKRATRYSPELDRSGNAAEAFNGTADIVKNIIGAAGE
                    190       200       210       220       230       240

orf46-1.pep  I
             |
orf46ng-1    IVGAGDAVQGISEGSNIAVMHGLGLLSTENKMARINDLADMAQLKDYAAAAIRDWAVQNP
                    250       260       270       280       290       300



Homology with a predicted ORF from N. meningitidis (strain A)

ORF46ng-1 (SEQ ID NO: 464) shows 87.4% identity over a 486aa overlap with an ORF (ORF46a)
(SEQ ID NO: 466) from strain A of N. meningitidis:



                     10        20        30        40        50        60
orf46a.pep   LGISRKISLILSILAVCLPMHAHASDLANDSFIRQVLDRQHFEPDGKYHLFGSRGELAER
             |||||||||||||||||||||||||||||| ||||||||||||||||||||||||||| |
orf46ng-1    LGISRKISLILSILAVCLPMHAHASDLANDPFIRQVLDRQHFEPDGKYHLFGSRGELAXR
                     10        20        30        40        50        60

                     70        80        90       100       110       120
orf46a.pep   SGHIGLGNIQSHQLGNLFIQQAAIKGNIGYIVRFSDHGHEVHSPFDNHASHSDSDEAGSP
             :||||||||||||||:|:|||||::||||||||||||||: |||||||||||||||||||
orf46ng-1    NGHIGLGNIQSHQLGHLMIQQAAVEGNIGYIVRFSDHGHKFHSPFDNHASHSDSDEAGSP
                     70        80        90       100       110       120

                    130       140       150       160       170       180
orf46a.pep   VDGFSLYRIHWDGYEHHPADGYDGPQGGGYPAPKGARDIYSYDIKGVAQNIRLNLTDNRS
             ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
orf46ng-1    VDGFSLYRIHWDGYEHHPADGYDGPQGGGYPAPKGARDIYSYDIKGVAQNIRLNLTDNRS
                    130       140       150       160       170       180

                    190       200       210       220       230       240
orf46a.pep   TGQRLVDRFHNTGSMLTQGVGDGFKRATRYSPELDRSGNAAEAFNGTADIVKNIIGAAGE
             |||||:|||||:|:||||||||||||||||||||||||||||||||||||||||||||||
orf46ng-1    TGQRLADRFHNAGAMLTQGVGDGFKRATRYSPELDRSGNAAEAFNGTADIVKNIIGAAGE
                    190       200       210       220       230       240

                    250       260       270       280       290       300
orf46a.pep   IVGAGDAVQGISEGSNIAVMHGLGLLSTENKMARINDLADMAQLKDYAAAAIRDWAVQNP
             ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
orf46ng-1    IVGAGDAVQGISEGSNIAVMHGLGLLSTENKMARINDLADMAQLKDYAAAAIRDWAVQNP
                    250       260       270       280       290       300

                    310       320       330       340       350       360
orf46a.pep   NAAQGIEAVSNIFTAVIPVKGIGAVRGKYGLGGITAHPVKRSQMGEIALPKGKSAVSDNF
             ||||||||||||| |:||:|||||||||||||||||||||||||| ||||||||||||||
orf46ng-1    NAAQGIEAVSNIFMAAIPIKGIGAVRGKYGLGGITAHPVKRSQMGAIALPKGKSAVSDNF
                    310       320       330       340       350       360

                    370       380       390       400       410       420
orf46a.pep   ADAAYAKYPSPYHSRNIRSNLEQRYGKENITSSTVPPSNGKNVKLANKRHPKTKVPFDGK
             ||||||||||||||||||||||||||||||||||||||||||||||::||||| ||||||
orf46ng-1    ADAAYAKYPSPYHSRNIRSNLEQRYGKENITSSTVPPSNGKNVKLADQRHPKTGVPFDGK
                    370       380       390       400       410       420

                    430       440           450         460       470
orf46a.pep   GFPNFEKDVKYDTRINTAVPQVN----PIDEPVFN--PKGSVGSAHSWSITARIQYAKLP
             ||||||| |||||:::  : :::    |  :|||:  |:  |    : ::|:| |  |
orf46ng-1    GFPNFEKHVKYDTKLD--IQELSGGGIPKAKPVFDAKPRWEVDRKLN-KLTTREQVEKNV
                    430         440       450       460        470

                480       490       500       510       520       530
orf46a.pep   RQGRIRYIPPKNYSPSAPLPKGPNNGYLDKFGNEWTKGPSRTKGQEFEWDVQLSKTGREQ
             :: | |
orf46ng-1    QETRRRSQSSQFKAHAQREWENKTGLDFNHFIGGDINKKGTVTGGHSLTRGDVRVIQQTS
             480       490       500       510       520       530

The complete length ORF46a DNA sequence [<SEQ ID 465>] (SEQ ID NO: 465) is:

1 TTGGGCATTT CCCGCAAAAT ATCCCTTATT CTGTCCATAC TGGCAGTGTG 

51 CCTGCCGATG CATGCACACG CCTCAGATTT GGCAAACGAT TCTTTTATCC 

101 GGCAGGTTCT CGACCGTCAG CATTTCGAAC CCGACGGGAA ATACCACCTA 

151 TTCGGCAGCA GGGGGGAACT TGCCGAGCGC AGCGGTCATA TCGGATTGGG 

201 AAACATACAA AGCCATCAGT TGGGCAACCT GTTCATCCAG CAGGCGGCCA
251 TTAAAGGAAA TATCGGCTAC ATTGTCCGCT TTTCCGATCA CGGGCACGAA
301 GTCCATTCCC CCTTCGACAA CCATGCCTCA CATTCCGATT CTGATGAAGC

351 CGGTAGTCCC GTTGACGGAT TCAGCCTTTA CCGCATCCAT TGGGACGGAT

401 ACGAACACCA TCCCGCCGAC GGCTATGACG GGCCACAGGG CGGCGGCTAT
451 CCCGCTCCCA AAGGCGCGAG GGATATATAC AGCTACGACA TAAAAGGCGT
501 TGCCCAAAAT ATCCGCCTCA ACCTGACCGA CAACCGCAGC ACCGGACAAC 
551 GGCTTGTCGA CCGTTTCCAC AATACCGGTA GTATGCTGAC GCAAGGAGTA 
601 GGCGACGGAT TCAAACGCGC CACCCGATAC AGCCCCGAGC TGGACAGATC 
651 GGGCAATGCC GCCGAAGCTT TCAACGGCAC TGCAGATATC GTCAAAAACA 
701 TCATCGGCGC GGCAGGAGAA ATTGTCGGCG CAGGCGATGC CGTGCAGGGT 
751 ATAAGCGAAG GCTCAAACAT TGCTGTTATG CACGGCTTGG GTCTGCTTTC 
801 CACCGAAAAC AAGATGGCGC GCATCAACGA TTTGGCAGAT ATGGCGCAAC 
851 TCAAAGACTA TGCCGCAGCA GCCATCCGCG ATTGGGCAGT CCAAAACCCC 
901 AATGCCGCAC AAGGCATAGA AGCCGTCAGC AATATCTTTA CGGCAGTCAT 
951 CCCCGTCAAA GGGATTGGAG CTGTTCGGGG AAAATACGGC TTGGGCGGCA 

1001 TCACGGCACA TCCTGTCAAG CGGTCGCAGA TGGGCGAGAT CGCATTGCCG 

1051 AAAGGGAAAT CCGCCGTCAG CGACAATTTT GCCGATGCGG CATACGCCAA 

1101 ATACCCGTCC CCTTACCATT CCCGAAATAT CCGTTCAAAC TTGGAGCAGC 

1151 GTTACGGCAA AGAAAACATC ACCTCCTCAA CCGTGCCGCC GTCAAACGGA 

1201 AAGAATGTGA AACTGGCAAA CAAACGCCAC CCGAAGACCA AAGTGCCGTT
1251 TGACGGTAAA GGGTTTCCGA ATTTTGAAAA AGACGTAAAA TACGATACGA 
1301 GAATTAATAC CGCTGTACCA CAAGTGAATC CTATAGATGA ACCCGTCTTT 

1351 AATCCTAAAG GTTCTGTCGG ATCGGCTCAT TCTTGGTCTA TAACTGCCAG

1401 AATTCAATAC GCAAAATTAC CAAGGCAAGG TAGAATCAGA TATATCCCAC
1451 CTAAAAATTA CTCTCCTTCA GCACCGCTAC CAAAAGGACC TAATAATGGA
1501 TATTTGGATA AATTTGGTAA TGAATGGACT AAAGGTCCAT CAAGAACTAA

1551 AGGTCAAGAA TTTGAATGGG ATGTTCAATT GTCTAAAACA GGAAGAGAGC 

1601 AACTTGGATG GGCTAGTAGG GATGGTAAGC ATTTAAATAT ATCAATTGAT 

1651 GGAAAGATTA CACACAAATG A 

This corresponds to the amino acid sequence [<SEQ ID 466>] (SEQ ID NO: 466):



1 LGISRKISLI LSILAVCLPM HAHASDLAND SFIRQVLDRQ HFEPDGKYHL
51 FGSRGELAER SGHIGLGNIQ SHQLGNLFIQ QAAIKGNIGY IVRFSDHGHE 
101 VHSPFDNHAS HSDSDEAGSP VDGFSLYRIH WDGYEHHPAD GYDGPQGGGY 
151 PAPKGARDIY SYDIKGVAQN IRLNLTDNRS TGQRLVDRFH NTGSMLTQGV 
201 GDGFKRATRY SPELDRSGNA AEAFNGTADI VKNIIGAAGE IVGAGDAVQG 
251 ISEGSNIAVM HGLGLLSTEN KMARINDLAD MAQLKDYAAA AIRDWAVQNP 
301 NAAQGIEAVS NIFTAVIPVK GIGAVRGKYG LGGITAHPVK RSQMGEIALP 
351 KGKSAVSDNF ADAAYAKYPS PYHSRNIRSN LEQRYGKENI TSSTVPPSNG 
401 KNVKLANKRH PKTKVPFDGK GFPNFEKDVK YDTRINTAVP QVNPIDEPVF
451 NPKGSVGSAH SWSITARIQY AKLPRQGRIR YIPPKNYSPS APLPKGPNNG 
501 YLDKFGNEWT KGPSRTKGQE FEWDVQLSKT GREQLGWASR DGKHLNISID 
551 GKITHK* 

Based on this analysis, including the presence of an RGD sequence in the gonococcal protein,
typical of adhesins, it is predicted that the proteins from N. meningitidis and N. gonorrhoeae, and 
their epitopes, could be useful antigens for vaccines or diagnostics, or for raising antibodies. 
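
The RGD tripeptide mentioned above can be located in a protein listing with a simple substring search. The Python sketch below is illustrative only; the function name `find_motif` and the `start` parameter (the residue number of the first letter supplied) are assumptions made for this example.

```python
def find_motif(protein, motif, start=1):
    """Return the 1-based residue numbers at which `motif` begins."""
    seq = "".join(protein.split()).upper()
    hits = []
    i = seq.find(motif)
    while i != -1:
        hits.append(start + i)
        i = seq.find(motif, i + 1)  # continue after the previous hit
    return hits


# Residues 521-540 of the ORF46ng-1 listing (row 501, third and fourth groups).
print(find_motif("GGHSLTRGDV RVIQQTSAPD", "RGD", start=521))
```

With the fragment shown, the motif begins at residue 527 of the gonococcal sequence.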



Example 56 



The following partial DNA sequence was identified in N. meningitidis [<SEQ ID 467>] (SEQ ID
NO: 467):



1 ATGAATATTC ACACCCTGCT CTCCAAACAA TGGACGCTGC CGCCATTCCT 

51 GCCGAAACGG CTGCTGCTGT CCCTGCTGAT ACTGCTTGCC CCCAATGCGG 

101 TGTTTTGGGT TTTGGCACTG CTGACCGCCA CCGCCCGCCC GATTGTCAAT 

151 TTGGACTATC TTCCCGCCGC GCTGCTGATC GCCCTGCCTT GGCGTTTCGT 

201 CAAAATTGCC GGCGTATTGG CGTTTTGGCT GGCGGTTTTG TTTGACGGGC 

251 TGATGATGGT GATCCAACTC TTCCCTTTTA TGGATCTCAT CGGCGCCATC 

301 AACCTCGTCC CCTTCATCCT GACCGCCCCC GCCCCTTATC AGATAATGAC 

351 CGGGCTG. . . 

This corresponds to the amino acid sequence [<SEQ ID 468; ORF48>] (SEQ ID NO: 468;
ORF48):



1 MNIHTLLSKQ WTLPPFLPKR LLLSLLILLA PNAVFWVLAL LTATARPIVN 
51 LDYLPAALLI ALPWRFVKIA GVLAFWLAVL FDGLMMVIQL FPFMDLIGAI 
101 NLVPFILTAP APYQIMTGL. . . 



Further work revealed the complete nucleotide sequence [<SEQ ID 469>] (SEQ ID NO: 469):



1 ATGAATATTC ACACCCTGCT CTCCAAACAA TGGACGCTGC CGCCATTCCT 
51 GCCGAAACGG CTGCTGCTGT CCCTGCTGAT ACTGCTTGCC CCCAATGCGG 
101 TGTTTTGGGT TTTGGCACTG CTGACCGCCA CCGCCCGCCC GATTGTCAAT 
151 TTGGACTATC TTCCCGCCGC GCTGCTGATC GCCCTGCCTT GGCGTTTCGT

201 CAAAATTGCC GGCGTATTGG CGTTTTGGCT GGCGGTTTTG TTTGACGGGC 

251 TGATGATGGT GATCCAACTC TTCCCTTTTA TGGATCTCAT CGGCGCCATC 

301 AACCTCGTCC CCTTCATCCT GACCGCCCCC GCCCCTTATC AGATAATGAC 

351 CGGGCTGTTG CTGCTGTATA TGCTGGCGAT GCCGTTTGTG TTGCAGAAAG 

401 CCGCCGCCAA AACCGACTTC CGGCACATTG CCGTCTGCGC CGCCGTTGTG

451 GCGGCAGCCG GCTATTTCAC CGGCCATTTG AGTTACTACG ACCGGGGTCG

501 GATGGCCAAT ATCTTCGGCG CAAACAACTT CTACTACGCC AAAAGTCAGG 

551 CGATGCTCTA CACCGTCAGC CAGAATGCCG ACTTTATTAC CGCCGGCCTG 

601 GTCGATCCCG TCTTCCTCCC CTTGGGCAAT CAACAGCGTG CCGCCACGCA 

651 TCTGAACGAG CCGAAATCTC AAAAAATCCT CTTTATCGTC GCCGAATCTT 

701 GGGGGCTGCC GGCCAATCCC GAACTTCAAA ACGCCACTTT TGCCAAACTG 

751 CTGGCGCAAA AAGACCGTTT TTCGGTTTGG GAAAGCGGCA GTTTTCCCTT 

801 CATCGGCGCG ACGGTCGAAG GCGAAATGCG CGAACTGTGT GCCTACGGCG 

851 GTTTGCGCGG GTTCGCACTG CGCCGCGCGC CCGACGAAAA ATTTGCCCGC 

901 TGCCTCCCCA ACCGTTTGAA ACAAGAAGGT TACGCCACCT TTGCGATGCA 

951 CGGCGCGGGC AGTTCGCTTT ACGACCGCTT CAGCTGGTAT CCGAGGGCGG 

1001 GCTTTCAAGA AATCAAAACC GCCGAAAACC TGATCGGTAA AAAAACCTGC 

1051 GCCATTTTCG GCGGCGTGTG CGACAGCGAG CTGTTCGGCG AAGTGTCGGC 

1101 ATTTTTCAAA AAACACGACA AGGGACTGTT TTACTGGATG ACGCTGACCA 

1151 GCCACGCCGA CTATCCCGAA TCCGACATTT TCAACCACAG GCTCAAATGC 

1201 ACCGAATATG GCCTGCCCGC CGAAACCGAC CTCTGCCGCA ATTTCAGCCT 

1251 GCACACCCAA TTCTTCGACC AACTGGCGGA TTTGATCCAA CGCCCCGAAA 

1301 TGAAAGGCAC GGAAGTCATC ATCGTCGGCG ACCATCCGCC GCCCGTCGGC 

1351 AACCTCAATG AAACCTTCCG CTACCTCAAA CAGGGGCACG TCGCCTGGCT 

1401 GAACTTCAAA ATCAAATAA

This corresponds to the amino acid sequence [<SEQ ID 470; ORF48-1>] (SEQ ID NO: 470; ORF48-1):



1 MNIHTLLSKQ WTLPPFLPKR LLLSLLILLA PNAVFWVLAL LTATARPIVN 

51 LDYLPAALLI ALPWRFVKIA GVLAFWLAVL FDGLMMVIQL FPFMDLIGAI

101 NLVPFILTAP APYQIMTGLL LLYMLAMPFV LQKAAAKTDF RHIAVCAAVV

151 AAAGYFTGHL SYYDRGRMAN IFGANNFYYA KSQAMLYTVS QNADFITAGL

201 VDPVFLPLGN QQRAATHLNE PKSQKILFIV AESWGLPANP ELQNATFAKL 

251 LAQKDRFSVW ESGSFPFIGA TVEGEMRELC AYGGLRGFAL RRAPDEKFAR 

301 CLPNRLKQEG YATFAMHGAG SSLYDRFSWY PRAGFQEIKT AENLIGKKTC 

351 AIFGGVCDSE LFGEVSAFFK KHDKGLFYWM TLTSHADYPE SDIFNHRLKC 

401 TEYGLPAETD LCRNFSLHTQ FFDQLADLIQ RPEMKGTEVI IVGDHPPPVG 

451 NLNETFRYLK QGHVAWLNFK IK* 

Computer analysis of this amino acid sequence gave the following results: 
Homology with a predicted ORF from N. meningitidis (strain A) 

ORF48 (SEQ ID NO: 468) shows 94.1% identity over a 119aa overlap with an ORF (ORF48a)
(SEQ ID NO: 472) from strain A of N. meningitidis:



10 20 30 40 50 60 

orf48 .pep MNIHTLLSKQWTLPPFLPKRLLLSLLILLAPNAVFWVLALLTATARPIVNLDYLPAALLI 

I M I I I I I I I I I I I I I I I I I I I I I ! I I llllllllllllllll I II llllllll 
orf48a MNIHTLLSKQWTLPPFLPKRLLLSLLILLXPNAVFWVLALLTATARPIVNLXYLPAALLI 

10 20 30 40 50 60 
70 80 90 100 110 119

orf 4 8 . pep ALPWRFVKIAGVLAFWLAVLFDGLMMVIQLFPFMDLIGAINLVPFILTAPAPYQIMTGL 

Mill III 1 1 1 1 1 1 1 1 1 1 1 1 1 1 II 1 1 1 1 1 i M 1 1 1 1 Ml 1 1 1 1 1 1 1 lllllll 

orf 4 8a ALPWRXVKIXGVLAXWLAVLFDGLMMVIQLFPFMDLIGAINLVPFIXTAPALYQIMTGLL 

70 80 90 100 110 120 

orf 4 8a LLYMl^PFVLQKAAAKTDFRHIAACMVWAAGYFTGHLSXYDRGRMANIFGANNFYYA 

130 140 150 160 170 180 
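
The identity figure quoted for this overlap can be reproduced directly from the two listings. The sketch below is an illustration rather than the alignment program used in the original analysis: it assumes the overlap is ungapped, takes the first 119 residues of ORF48a, and counts an undetermined residue (X) as a mismatch. The names `percent_identity`, `ORF48` and `ORF48A` are hypothetical.

```python
def percent_identity(seq_a, seq_b):
    """Percentage of aligned positions carrying the same residue.

    Both strings must already be aligned and of equal length.
    Gap characters ('-') and undetermined residues ('X') never match.
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("aligned sequences must have equal length")
    matches = sum(1 for a, b in zip(seq_a, seq_b) if a == b and a not in "X-")
    return 100.0 * matches / len(seq_a)


# ORF48 (SEQ ID NO: 468), 119 residues, in ten-residue groups.
ORF48 = (
    "MNIHTLLSKQ" "WTLPPFLPKR" "LLLSLLILLA" "PNAVFWVLAL" "LTATARPIVN"
    "LDYLPAALLI" "ALPWRFVKIA" "GVLAFWLAVL" "FDGLMMVIQL" "FPFMDLIGAI"
    "NLVPFILTAP" "APYQIMTGL"
)
# First 119 residues of ORF48a (SEQ ID NO: 472).
ORF48A = (
    "MNIHTLLSKQ" "WTLPPFLPKR" "LLLSLLILLX" "PNAVFWVLAL" "LTATARPIVN"
    "LXYLPAALLI" "ALPWRXVKIX" "GVLAXWLAVL" "FDGLMMVIQL" "FPFMDLIGAI"
    "NLVPFIXTAP" "ALYQIMTGL"
)

print(round(percent_identity(ORF48, ORF48A), 1))  # 94.1
```

Seven of the 119 positions differ (six undetermined residues and one P to L substitution), giving 112/119, in agreement with the stated 94.1%.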

The complete length ORF48a nucleotide sequence [<SEQ ID 471>] (SEQ ID NO: 471) is:



1 ATGAATATTC ACACCCTGCT CTCCAAACAA TGGACGCTGC CGCCATTCCT

51 GCCGAAACGG CTGCTGCTGT CCCTGCTGAT ACTGCTNNCC CCCAATGCGG 

101 TGTTTTGGGT TTTGGCACTG CTGACCGCCA CCGCCCGCCC GATTGTCAAT 

151 TTGGANTACC TTCCCGCCGC GCTGCTGATC GCCCTGCCTT GGCGTNTCGT 

201 CAAAATTGNC GGCGTATTGG CGTNTTGGCT GGCGGTTTTG TTTGACGGGC

251 TGATGATGGT GATCCAACTC TTCCCTTTTA TGGATCTCAT CGGCGCCATC

301 AACCTCGTCC CCTTCATCNT GACCGCCCCC GCCCTTTATC AGATAATGAC
351 CGGGCTGTTA CTGCTGTATA TGCTGGCGAT GCCGTTTGTG TTGCAGAAAG 

401 CCGCCGCCAA AACCGACTTC CGACACATTG CCGCCTGTGC CGCCGTTGTG

451 GTGGCAGCCG GCTATTTTAC CGGCCATTTG AGTTANTACG ACCGGGGGCG

501 GATGGCCAAT ATCTTCGGCG CAAACAACTT CTATTACGCC AAAAGTCAGG

551 CGATGCTCTA CACCGTCAGC CAGAATGCCG ACTTTATTAC CGCCGGCCTG 

601 GTCGATCCCG TCTTCCTCCC CTTGGGCAAT CAACAGCGTG CCGCCACGCA 

651 TCTGAACGAG CCGAAATCTC AAAAAATCCT CTTTATCGTC GCCGAATCTT 

701 GGGGGCTGCC GGCCAATCCC GAACTTCAAA ACGCCACTTT TGCCAAACTG 

751 CTGGCGCAAA AAGANCGTTT TTCGGTTTGG GAAAGCGGCA GTTTTCCCTT

801 CATCGGCGCG ACGATCGAAG GCGAAATGCG CGAACTGTGT GCCTACGGCG 

851 GTTTGCGCGG GTTCGCACTG CGCCGCGCGC CCGACGAAAA ATTTGCCCGC 

901 TGCCTCCCCA ACCGTTTGAA ACAAGAAGGT TACGCCACCT TTGCGATGCA 

951 CGGCGCGGGC AGTTCGCTTT ACGACCGCTT CAGCTGGTAT CCGAGGGCGG 

1001 GCTTTCAAGA AATCAAAACC GCCGAAAACC TGATCGGTAA AAAAACCTGC

1051 GCCATTTTCG GCGGCGTGTG CGACAGCGAG CTGTTCGGCG AAGTGTCGGC 

1101 ANTTTTCAAA AAACACGACA AGGGACTGTT TTACTGGATG ACGCTGACCA 

1151 GCCACGCCGA CTATCCCGAA TCNGACATTT TCAACCACAG GCTCAAATGC 

1201 ACCGAATATG GCCTGCCCGC CGAAACCGAC NTCTGCCGCA ATTTCAGCCT

1251 GCACACCCAA TTCTTCGACC AACTGGCGGA TTTGATCCAA CGCCCCGAAA

1301 TGAAAGGCAC GGAAGTCATC ATCGTCGGCG ACCATCCGCC GCCCGTCGGC 

1351 AACCTCAATG AAACCTTCCG CTACCTCAAA CAGGGGCACG TCGNCTGGCT

1401 GAACTTCAAA ATCAAATAA

This encodes a protein having amino acid sequence [<SEQ ID 472>] (SEQ ID NO: 472):

1 MNIHTLLSKQ WTLPPFLPKR LLLSLLILLX PNAVFWVLAL LTATARPIVN 

51 LXYLPAALLI ALPWRXVKIX GVLAXWLAVL FDGLMMVIQL FPFMDLIGAI

101 NLVPFIXTAP ALYQIMTGLL LLYMLAMPFV LQKAAAKTDF RHIAACAAVV

151 VAAGYFTGHL SXYDRGRMAN IFGANNFYYA KSQAMLYTVS QNADFITAGL

201 VDPVFLPLGN QQRAATHLNE PKSQKILFIV AESWGLPANP ELQNATFAKL

251 LAQKXRFSVW ESGSFPFIGA TIEGEMRELC AYGGLRGFAL RRAPDEKFAR 

301 CLPNRLKQEG YATFAMHGAG SSLYDRFSWY PRAGFQEIKT AENLIGKKTC 

351 AIFGGVCDSE LFGEVSAXFK KHDKGLFYWM TLTSHADYPE SDIFNHRLKC 

401 TEYGLPAETD XCRNFSLHTQ FFDQLADLIQ RPEMKGTEVI IVGDHPPPVG 

451 NLNETFRYLK QGHVXWLNFK IK*

ORF48a (SEQ ID NO: 472) and ORF48-1 (SEQ ID NO: 470) show 96.8% identity in 472 aa overlap:
10 20 30 40 50 60

orf 4 8a . pep ' MNIHTLLSKQWTLPPFLPKRLLLSLLILLXPNAVFWVLALLTATARPIVNLXYLPAALLI 

1 1 1 1 1 1 ! 1 1 1 1 1 M 1 1 1 1 1 1 ! 1 1 MM I 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 i 1 1 1 IMIIIII 

orf 4 8 - 1 MNIHTLLSKQWTLPPFLPKRLLLSLLILLAPNAVFWVLALLTATARPI VNLDYLPAALLI 

5 10 20 30 40 50 60 

70 80 90 100 110 120 

or f 4 8a . pep ALPWRXVKIXGVLAXWLAVLFDGLMMVIQLFPFMDLIGAINLVPFIXTAPALYQIMTGLL 

Mill III 1 1 1 1 MMMMMM MMMMMMMMM I II I IIIIIMI 

orf48-l ALPWRFVKIAGVLAFWLAVLFDGLMMVIQLFPFMDLIGAINLVPFILTAPAPYQIMTGLL 
10 70 80 90 100 110 120 

130 140 150 160 170 180 

orf 4 8a . pep LLYMLAMPFVLQKAAAKTDFRHIAACAAVWAAGYFTGHLSXYDRGRMANI FGANNFYYA 

MIMMMMMMMMMMMI IMMMMMM 1 1 1 1 1 1 1 1 II 1 1 1 II 1 1 1 

orf 4 8 - 1 LLYMLAMPFVLQKAAAKTDFRHIAVCAAWAAAGYFTGHLSYYDRGRMAN I FGANNFYYA 

15 130 140 150 160 170 180 

190 200 210 220 230 240 

orf 48a . pep KSQAMLYTVSQNADFITAGLVDPVFLPLGNQQRAATHLNEPKSQKILFIVAESWGLPANP 

1 1 1 M 1 1 1 1 II 1 1 1 1 1 1 1 1 1 M 1 1 M 1 1 M I M 1 1 1 1 1 1 1 1 M I ! 1 1 1 1 1 1 1 1 1 M M 

orf 4 8-1 KSQAMLYTVSQNADFITAGLVDPVFLPLGNQQRAATHLNEPKSQKILFIVAESWGLPANP 
20 190 200 210 220 230 240 

250 260 270 280 290 300 

orf 4 8a . pep ELQNATFAKLLAQKXRFSVWESGSFPFIGATIEGEMRELCAYGGLRGFALRRAPDEKFAR 

II II II I II II II I 1 1 M M M I M M 1 1 MM M M M M M M M 1 1 M M M M M 

or f 4 8-1 ELQNATFAKLLAQKDRFSVWESGSFPFIGATVEGEMRELCAYGGLRGFALRRAPDEKFAR 
25 250 260 270 280 290 300 

310 320 330 340 350 360 

orf 4 8a . pep CLPNRLKQEGYATFAMHGAGSSLYDRFSWYPRAGFQEIKTAENLIGKKTCAIFGGVCDSE 

I II 1 1 M 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 i I i 1 1 II > 1 1 1 1 1 1 I II II 1 1 1 1 1 1 1 1 

orf 48 - 1 CLPNRLKQEGYATFAMHGAGSSLYDRFSWYPRAGFQEI KTAENLIGKKTCAI FGGVCDSE 

30 310 320 330 340 350 360 

370 380 390 400 410 420 

orf 4 8a . pep LFGEVSAXFKKHDKGLFYWMTLTSHADYPESDIFNHRLKCTEYGLPAETDXCRNFSLHTQ 

Mill MIIIIIMIIIIIIIIIIIIMI MUM IIIIMMI MINIM 

orf 48-1 LFGEVSAFFKKHDKGLFYWMTLTSHADYPESDIFNHRLKCTEYGLPAETDLCRNFSLHTQ 
35 370 380 390 400 410 420 

430 440 450 460 470 

orf 48a . pep F FDQLADL I QR P EMKGTE V I I VGDH P P P VGNLNET FR YL KQGHVXWLNF K I KX 

MMMMMMMMMMIMM MIMI IMMMIIM IIIIIMI 

or f 4 8 - 1 F FDQLADL I QRP EMKGTE VI I VGDH P P P VGNLNET FRYLKQGHVAWLNF K I KX 

40 430 440 450 460 470 

Homology with a predicted ORF from N. gonorrhoeae 

ORF48 (SEQ ID NO: 468) shows 97.5% identity over a 119aa overlap with a predicted ORF
(ORF48ng) (SEQ ID NO: 474) from N. gonorrhoeae:

orf48.pep   MNIHTLLSKQWTLPPFLPKRLLLSLLILLAPNAVFWVLALLTATARPIVNLDYLPAALLI 60
            ||||:|||:|||||||||||||||||||||||||||||||||||||||||||||||||||
orf48ng     MNIHALLSEQWTLPPFLPKRLLLSLLILLAPNAVFWVLALLTATARPIVNLDYLPAALLI 60

orf48.pep   ALPWRFVKIAGVLAFWLAVLFDGLMMVIQLFPFMDLIGAINLVPFILTAPAPYQIMTGL 119
            |||||||||||||||| ||||||||||||||||||||||||||||||||||||||||||
orf48ng     ALPWRFVKIAGVLAFWPAVLFDGLMMVIQLFPFMDLIGAINLVPFILTAPAPYQIMTGLL 120

The ORF48ng nucleotide sequence [<SEQ ID 473>] (SEQ ID NO: 473) was predicted to encode a
protein having amino acid sequence [<SEQ ID 474>] (SEQ ID NO: 474):

1 MNIHALLSEQ WTLPPFLPKR LLLSLLILLA PNAVFWVLAL LTATARPIVN 

51 LDYLPAALLI ALPWRFVKIA GVLAFWPAVL FDGLMMVIQL FPFMDLIGAI

101 NLVPFILTAP APYQIMTGLL LLYMLAMPFV LQKAAVKTDF RHIAVCAAVV

151 AAARYFTGPF ELLRTGGRWQ YVQHRRLLLS GSRASFRRRQ KADVLRRLGN

201 PYASMGNGG..

Further work identified the complete gonococcal DNA sequence [<SEQ ID 475>] (SEQ ID NO: 475):



1 ATGAATATTC ACGCCCTGCT CTCCGAACAA TGGACGCTGC CGCCATTCCT 

51 GCCGAAACGG CTGCTGCTGT CCCTGCTGAT ACTGCTGGCC CCCAATGCGG 

101 TGTTTTGGGT TTTGGCACTG CTGACCGCCA CCGCCCGCCC GATTGTCAAT 

151 TTGGACTACC TTCCCGCCGC GCTGCTGATC GCCCTGCCTT GGCGTTTCGT 

201 CAAAATTGCC GGCGTATTGG CGTTTTGGCC GGCGGTTTTG TTTGACGGGC
251 TGATGATGGT GATCCAACTC TTCCCTTTTA TGGACCTCAT CGGCGCCATC 
301 AACCTCGTCC CCTTCATCCT GACCGCCCCC GCCCCTTATC AGATAATGAC 

351 CGGGCTGTTG CTGCTGTATA TGCTGGCGAT GCCGTTTGTG TTGCAAAAAG
401 CCGCCGTCAA AACCGACTTC CGACACATTG CCGTCTGTGC CGCCGTTGTG 
451 GCGGCAGCCG GCTATTTCAC CGGCCATTTG AGTTACTACG ACCGGGGGCG 
501 GATGGCCAAT ATCTTCGGCG CAAACAACTT CTATTACGCc aAAAGTCAGG 
551 CGATGCTCTA CACCGTCAGC CAGAATGCCG ACTTTATTAC CGCCGgcctG 
601 GTCGACCCCG TCTTCCTCCC CTTGGGCAAT CAGCAGCGTG CCGCCACGCG 
651 GCTGAGTGAG CCGAAATCTC AAAAAATCCT CTTTATCGTC GCCGAATCTT 
701 GGGGGCTGCC GGGCAATCCC GAGCTTCAAA ACGCCACTTT TGCCAAACTG 
751 CTGGCGCAAA AAGACCGTTT TTCGGTTTGG GAAAGCGGCA GTTTTCCCTT 
801 CATCGGCGCG ACGGTCGAAG GCGAAATGCG CGAATTGTGC GCCTACGGCG 
851 GTTTGCGCGG GTTCGCACTG CGCCGCGCGC CCGACGAAAA ATTTGCCCGC 
901 TGCCTCCCCA ACCGTTTGAA ACAAGAAGGT TACGCCACCT TTGCGATGCA
951 CGGCGCGGGT AGTTCGCTTT ACGACCGCTT CAGCTGGTAT CCGAGGGCGG 

1001 GCTTTCAAAA AATCAAAACC GCCGAAAACC TGATCGGTAA AAAAACCTGC 

1051 GCCATTTTCG GCGGCGTGTG CGACAGCGAG CTGTTCGGCG AAGTGTCGGC 

1101 ATTTTTCAAA AAACACGACA AGGGACTGTT TTACTGGATG ACGCTGACCA 

1151 GCCACGCCGA CTATCCCGAA TCCGACATTT TCAACCACAG GCTCAAATGC 

1201 ACCGAATACG GCCTGCCCGC CGAAACCGAC CTCTGCCGCA ATTTCAGCCT 

1251 GCACACCCAA TtcttcgACC AACTGGCGGA TTTGATCCGA CGCCCCGAAA 

1301 TGAAAGGCAC GGAAGTCATC ATCGTCGGCG ACCATCCGCC GCCCGTCGGC
1351 AACCTCAATG AAACCTTCCG CTACCTCAAA CAGGGACACG TCGCCTGGCT

1401 GCACTTCAAA ATCAAATAA

This encodes a protein having amino acid sequence [<SEQ ID 476; ORF48ng-1>] (SEQ ID NO: 476; ORF48ng-1):



1 MNIHALLSEQ WTLPPFLPKR LLLSLLILLA PNAVFWVLAL LTATARPIVN 

51 LDYLPAALLI ALPWRFVKIA GVLAFWPAVL FDGLMMVIQL FPFMDLIGAI 

101 NLVPFILTAP APYQIMTGLL LLYMLAMPFV LQKAAVKTDF RHIAVCAAVV

151 AAAGYFTGHL SYYDRGRMAN IFGANNFYYA KSQAMLYTVS QNADFITAGL 

201 VDPVFLPLGN QQRAATRLSE PKSQKILFIV AESWGLPGNP ELQNATFAKL 
251 LAQKDRFSVW ESGSFPFIGA TVEGEMRELC AYGGLRGFAL RRAPDEKFAR

301 CLPNRLKQEG YATFAMHGAG SSLYDRFSWY PRAGFQKIKT AENLIGKKTC 

351 AIFGGVCDSE LFGEVSAFFK KHDKGLFYWM TLTSHADYPE SDIFNHRLKC 

401 TEYGLPAETD LCRNFSLHTQ FFDQLADLIR RPEMKGTEVI IVGDHPPPVG 

451 NLNETFRYLK QGHVAWLHFK IK* 



ORF48ng-1 (SEQ ID NO: 476) and ORF48-1 (SEQ ID NO: 470) show 97.9% identity in 472 aa
overlap:



10 20 30 40 50 60 

1 0 orf 4 8 - 1 . pep MNIHTLLSKQWTLPPFLPKRLLLSLLILLAPNAVFWVLALLTATARPIVNLDYLPAALLI 

I I I: I I hi II I I I I ' I II I I I I I I I i I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 
orf48ng-l MNIHALLSEQWTLPPFLPKRLLLSLLILLAPNAVFWLALLTATARPIVNLDYLPAALLI 

10 20 30 40 50 60 



70 80 90 100 110 120 

1 5 orf 4 8 - 1 . pep ALPWRFVKIAGVLAFWLAVLFDGLMMVIQLFPFMDLIGAINLVPFILTAPAPYQIMTGLL 

1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 MUM IMIII MMMMMMMIIIIMIMIMM 

orf 4 8ng- 1 ALPWRFVKIAGVLAFWPAVLFDGLMMVIQLFPFMDLIGAINLVPFILTAPAPYQIMTGLL 

70 80 90 100 110 120 



130 140 150. 160 170 180 

20 or f 4 8 - 1 . pep LL YMLAM P FVLQKAAAKTD FRH I AVCAA WAAAG Y FTGHLS Y YDRGRMAN I FGANNFYYA 

M M M I MM I M MM M M I M M M M M M M M M M M M M M M M I M I 

orf4 8ng-l LL YMLAM PFVLQKAAVKTD FRH I AVCAA WAAAG YFTGHLSY YDRGRMAN I FGANNFYYA 

130 140 150 160 170 180 



190 200 210 220 230 240 

25 orf 4 8 - 1 . pep KSQAMLYTVSQNADFITAGLVDPVFLPLGNQQRAATHLNEPKSQKILFIVAESWGLPANP 

I : I I I I I I Mill I I I I I I I I I I I I I I i I I I h I : I I I I I I I I I I I I I I I I I = I I 
orf4 8ng-l KSQAMLYTVSQNADFITAGLVDPVFLPLGNQQRAATRLSEPKSQKILFIVAESWGLPGNP 

190 200 210 220 230 240 

250 260 270 280 290 300 

30 or f 4 8 - 1 . pep ELQNATFAKLLAQKDRFSVWESGSFPFIGATVEGEMRELCAYGGLRGFALRRAPDEKFAR 

M II II MM 1 1 M I Ml II III Ml I Ml I 1 1 1 Ml MM I III II I Ml 1 1 II II 

orf 4 8ng- 1 ELQNATFAKLLAQKDRFSVWESGSFPFIGATVEGEMRELCAYGGLRGFALRRAPDEKFAR 

250 260 270 280 290 300 



310 320 330 340 350 360 

35 orf 48-1 .pep CLPNRLKQEGYATFAMHGAGSSLYDRFSWYPRAGFQEIKTAENLIGKKTCAIFGGVCDSE 

II 1 1 1 II I M 1 1 1 1 1 1 1 M M 1 1 1 1 1 M I II 1 1 MM 1 1 1 1 I II 1 1 1 M 1 1 1 II M 1 1 

orf4 8ng-l CLPNRLKQEGYATFAMHGAGSSLYDRFSWYPRAGFQKIKTAENLIGKKTCAIFGGVCDSE 

310 320 330 340 350 360 



370 380 390 400 410 420 

40 orf 48-1 .pep LFGEVSAFFKKHDKGLFYWMTLTSHADYPESDIFNHRLKCTEYGLPAETDLCRNFSLHTQ 

I M 1 1 1 1 1 II 1 1 1 II 1 1 1 1 1 1 1 1 M I II M I M 1 1 II IM 1 1 1 1 MM 1 1 1 1 M 1 1 1 

orf 4 8ng- 1 LFGEVSAFFKKHDKGLFYWMTLTSHADYPESDIFNHRLKCTEYGLPAETDLCRNFSLHTQ 

370 380 390 400 410 420 



430 440 450 460 470 

45 orf 48-1 .pep FFDQLADLIQRPEMKGTEVIIVGDHPPPVGNLNETFRYLKQGHVAWLNFKIKX 

II M 1 1 1 1 1 M M I II 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 M 1 1 1 1 1 1 1 MM 1 1 1 

orf4 8ng-l FFDQLADLIRRPEMKGTEVIIVGDHPPPVGNLNETFRYLKQGHVAWLHFKIKX 

430 440 450 460 470 
Based on this analysis, including the presence of a putative leader sequence (double-underlined)
and two putative transmembrane domains (single-underlined) in the gonococcal protein, it is
predicted that the proteins from N. meningitidis and N. gonorrhoeae, and their epitopes, could be
useful antigens for vaccines or diagnostics, or for raising antibodies.



Example 57 



The following partial DNA sequence was identified in N. meningitidis [<SEQ ID 477>] (SEQ ID NO: 477):



1 . . GTGAGCGGAC GTTACCGCGC TTTGGATCGC GTTTCCAAAA TCATCATCGT 

51 TACTTTGAGT ATCGCCACGC TTGCCGCCGC CGGCATCGCT ATGTCGCGCG 

101 GTATGCAGAT GCAGTCCGAT TTTATCGAGC CGACACCGTG GACGCTTGCC 

151 GGTTTGGGCT TCCTGATCGC GCTGATGGGC TGGATGCCCG CGCCGATTGA 

201 AATTTCCGCC ATCAATTCTT TGTGGGTAAC CGAAAAACAA CGCATCAATC
251 CTTCCGAATA CCGCGACGGG ATTTTTGAAT TCAACGTCGG TTATATCGCC

301 AGTGCGGTTT TGGCTTTGGT TTTCCTTGCA CTGGGCGC.G TAGCGCCGAA

351 CGGCAACGGC GA.ACAGTGC AGATGGCGGG CGGCAAATAT AACGGGCAAT

401 TGATCAATAT GTACGCC..

This corresponds to the amino acid sequence [<SEQ ID 478; ORF53>] (SEQ ID NO: 478; ORF53):



1 . . VSGRYRALDR VSKIIIVTLS IATLAAAGIA MSRGMQMQSD FIEPTPWTLA 
51 GLGFLIALMG WMPAPIEISA INSLWVTEKQ RINPSEYRDG IFEFNVGYIA 
101 SAVLALVFLA LGXVAPNGNG XTVQMAGGKY NGQLINMYA. . 

Further work revealed the complete nucleotide sequence [<SEQ ID 479>] (SEQ ID NO: 479):



1 ATGTCCGAAC AACATATTTC GACTTGGAAA AGTAAAATCA ACGCATTGGG 

51 TCCGGGGATC ATGATGGCTT CGGCGGCGGT CGGCGGTTCG CACCTGATTG 

101 CCTCGACGCA GGCGGGCGCG CTTTACGGCT GGCAGATCGC GCTCATCATC 

151 ATCCTGACCA ACCTCTTCAA ATACCCGTTT TTCCGCTTCA GCGCGCATTA 

201 CACGCTGGAC ACGGGCAAGA GCCTGATTGA AGGTTATGCC GAGAAAAGCC
251 GCGTTTATTT GTGGGTATTC CTGATTTTGT GCATCCTCTC CGCCACGATT 
301 AACGCGGGCG CGGTCGCCAT TGTAACCGCC GCCATCGTCA AAATGGCGAT 

351 TCCCTCGCTG ATGTTTGATG CCGGCACGGT TGCCGCCTTG ATTATGGCAT

401 CCTGCCTGAT TATTTTGGTG AGCGGACGTT ACCGCGCTTT GGATCGCGTT
451 TCCAAAATCA TCATCGTTAC TTTGAGTATC GCCACGCTTG CCGCCGCCGG
501 CATCGCTATG TCGCGCGGTA TGCAGATGCA GTCCGATTTT ATCGAGCCGA 
551 CACCGTGGAC GCTTGCCGGT TTGGGCTTCC TGATCGCGCT GATGGGCTGG 
601 ATGCCCGCGC CGATTGAAAT TTCCGCCATC AATTCTTTGT GGGTAACCGA 
651 AAAACAACGC ATCAATCCTT CCGAATACCG CGACGGGATT TTTGATTTCA 
701 ACGTCGGTTA TATCGCCAGT GCGGTTTTGG CTTTGGTTTT CCTTGCACTG 
751 GGCGCGTTTG TGCAATACGG CAACGGCGAA GCAGTGCAGA TGGCGGGCGG 
801 CAAATATATC GGGCAATTGA TCAATATGTA CGCCGTTACC ATCGGCGGCT 
851 GGTCGCGCCC GCTGGTGGCG TTTATCGCGT TTGCCTGTAT GTACGGCACG 
901 ACGATTACCG TCGTGGACGG CTATGCCCGT GCCATTGCCG AACCCGTGCG 
951 CCTGCTGCGC GGAAAAGACA AAACGGGCAA CGCCGAATTC TTTGCCTGGA 

1001 ATATTTGGGT GGCGGGCAGC GGTTTGGCGG TGATTTTCTG GTTTGACGGC 
1051 GTAATGGCGA ATCTGCTCAA ATTTGCGATG ATTGCCGCTT TTGTGTCCGC

1101 CCCTGTGTTT GCCTGGCTGA ATTACCGTTT GGTTAAAGGT GATGAAAAAC 

1151 ACAAACTCAC ATCAGGTATG AATGCCCTTG CATTGGCAGG CTTGATTTAT 

1201 CTGACCGGTT TTACCGTTTT GTTCTTATTG AATTTGGCGG GAATGTTCAA 

1251 ATGA 

This corresponds to the amino acid sequence [<SEQ ID 480; ORF53-1>] (SEQ ID NO: 480; ORF53-1):

1 MSEQHISTWK SKINALGPGI MMASAAVGGS HLIASTQAGA LYGWQIALII
51 ILTNLFKYPF FRFSAHYTLD TGKSLIEGYA EKSRVYLWVF LILCILSATI
101 NAGAVAIVTA AIVKMAIPSL MFDAGTVAAL IMASCLIILV SGRYRALDRV
151 SKIIIVTLSI ATLAAAGIAM SRGMQMQSDF IEPTPWTLAG LGFLIALMGW
201 MPAPIEISAI NSLWVTEKQR INPSEYRDGI FDFNVGYIAS AVLALVFLAL
251 GAFVQYGNGE AVQMAGGKYI GQLINMYAVT IGGWSRPLVA FIAFACMYGT
301 TITVVDGYAR AIAEPVRLLR GKDKTGNAEF FAWNIWVAGS GLAVIFWFDG
351 VMANLLKFAM IAAFVSAPVF AWLNYRLVKG DEKHKLTSGM NALALAGLIY
401 LTGFTVLFLL NLAGMFK*

Computer analysis of this amino acid sequence gave the following results: 
Homology with a predicted ORF from N. meningitidis (strain A)

ORF53 (SEQ ID NO: 478) shows 93.5% identity over a 139aa overlap with an ORF (ORF53a)
(SEQ ID NO: 482) from strain A of N. meningitidis:

10 20 30 

orf 53 . pep VSGRYRALDRVS KI I IVTLS IATLAAAGIA 

Illlllllllllllilllllllllllllll 
orf 53a AAI VKMAI PS LMFDAGTVAAL I MAS CL 1 1 L VSGRYRALDRVSKI I IVTLS IATLAAAGIA 

110 120 130 140 150 160 

40 50 60 70 80 90 

orf 53 .pep MSRGMQMQSDFIEPTPW TLAGLGFLIALMGWMPA PIEISAINSLWVTEKQRINPSEYRDG 

IIIIIIMIIIIIIIIIIIIIIIII I IIIIIIIIMIIIMIMM;! IIIIMMI 

orf 53a MSRGMQMQSDFIEPTPW TLAGLGFLIALMGWMPA PIEISAINSLWVTEKQRINPSEYRDG 
170 180 190 200 210 220 

100 110 120 130 139 

orf 53 . pep I FEFNVGY IASAVLALVFLALGXVA PNGNGXTVQMAGGKYNGQLINMYA 

|:| MIIIIIMIIMIII : Ml :|||||||| llllllll 
orf53a IFD FNVGY IASAVLALVFLALGAFV QYGNGEAVQMAGGKY I GQL INMYAVT I GGWSRPLV 

230 240 250 260 270 280 

orf 53a AFIAFACMYGTTITWD GYARAIAEPVRLLRGKDKTGNAEFFAWNIWVAGSGLAVIFWFD 
290 300 310 320 330 340 

The complete length ORF53a nucleotide sequence [<SEQ ID 481>] (SEQ ID NO: 481) is:

1 ATGTCCGAAC AACATATTTC GACTTGGAAA AGTAAAATCA ACGCATTGGG 

51 ACCGGGGATT ATGATGGCTT CGGCGGCGGT CGGCGGTTCG CACCTGATTG 

101 CCTCGACGCA GGCGGGCGCG CTTTACGGCT GGCAGATCGC GCTCATCATC 

151 ATCCTGACCA ACCTCTTCAA ATACCCGTTT TTCCGCTTCA GCGCGCATTA 
201 CACGCTGGAC ACGGGCAAGA GCCTGATTGA AGGTTATGCC GAGAAAAGCC

251 GCGTTTATTT GTGGGTATTC CTGATTTTGT GCATCCTCTC CGCCACGATT 

301 AACGCGGGCG CGGTCGCCAT TGTAACCGCC GCCATCGTCA AAATGGCGAT 

351 TCCCTCGCTG ATGTTTGATG CCGGCACGGT TGCCGCCTTG ATTATGGCAT 

401 CCTGCCTGAT TATTTTGGTG AGCGGACGTT ACCGCGCTTT GGATCGCGTT

451 TCCAAAATCA TCATCGTTAC TTTGAGTATC GCCACGCTTG CCGCCGCCGG 

501 CATCGCTATG TCGCGCGGTA TGCAGATGCA GTCCGATTTT ATCGAGCCGA 

551 CACCGTGGAC GCTTGCCGGT TTGGGCTTCC TGATCGCGCT GATGGGCTGG 

601 ATGCCCGCGC CGATTGAAAT TTCCGCCATC AATTCTTTGT GGGTAACCGA 

651 AAAACAACGC ATCAATCCTT CCGAATACCG CGACGGGATT TTTGATTTCA 

701 ACGTCGGTTA TATCGCCAGT GCGGTTTTGG CTTTGGTTTT CCTTGCACTG 

751 GGCGCGTTTG TGCAATACGG CAACGGCGAA GCAGTGCAGA TGGCGGGCGG 

801 CAAATATATC GGGCAATTGA TCAATATGTA CGCCGTTACC ATCGGCGGCT 

851 GGTCGCGCCC GCTGGTGGCG TTTATCGCGT TTGCCTGTAT GTACGGCACG 

901 ACGATTACCG TTGTGGACGG CTATGCCCGT GCCATTGCCG AACCCGTGCG 

951 CCTGCTGCGC GGAAAAGACA AAACGGGCAA CGCCGAATTC TTTGCCTGGA 

1001 ATATTTGGGT GGCGGGCAGC GGTTTGGCGG TGATTTTCTG GTTTGACGGC 

1051 GTAATGGCGA ATCTGCTCAA ATTTGCGATG ATTGCCGCTT TTGTGTCCGC 

1101 CCCTGTGTTT GCCTGGCTGA ATTACCGTTT GGTCAAAGGT GATGAAAAAC 

1151 ACAAACTCAC ATCAGGTATG AATGCCCTTG CATTGGCAGG CTTGATTTAT 

1201 CTGACCGGTT TTACCGTTTT GTTCTTATTG AATTTGGCGG GAATGTTCAA 

1251 ATGA 



This encodes a protein having amino acid sequence [<SEQ ID 482>] (SEQ ID NO: 482):



1 MSEQHISTWK SKINALGPGI MMASAAVGGS HLIASTQAGA LYGWQIALII
51 ILTNLFKYPF FRFSAHYTLD TGKSLIEGYA EKSRVYLWVF LILCILSATI
101 NAGAVAIVTA AIVKMAIPSL MFDAGTVAAL IMASCLIILV SGRYRALDRV
151 SKIIIVTLSI ATLAAAGIAM SRGMQMQSDF IEPTPWTLAG LGFLIALMGW
201 MPAPIEISAI NSLWVTEKQR INPSEYRDGI FDFNVGYIAS AVLALVFLAL
251 GAFVQYGNGE AVQMAGGKYI GQLINMYAVT IGGWSRPLVA FIAFACMYGT
301 TITVVDGYAR AIAEPVRLLR GKDKTGNAEF FAWNIWVAGS GLAVIFWFDG
351 VMANLLKFAM IAAFVSAPVF AWLNYRLVKG DEKHKLTSGM NALALAGLIY
401 LTGFTVLFLL NLAGMFK*



ORF53a (SEQ ID NO: 482) shows 100.0% identity in 417 aa overlap with ORF53-1 (SEQ ID NO: 480):



10 20 30 40 50 60 

orf 53a. pep MSEQHISTWKSKINALGPGIMMASAAVGGSHLIASTQAGALYGWQIALI I ILTNLFKYPF 

1 1 1 : 1 1 M 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 M 1 1 1 1 1 Ml 1 1 1 1 ■ 1 1 1 1 1 1 ! 1 1 1 1 1 1 1 1 

orf 53-1 MSEQHISTWKSKINALGPGIMMASAAVGGSHLIASTQAGALYGWQIALII ILTNLFKYPF 

10 20 30 40 50 60 



70 80 90 100 110 120 

orf 53a. pep FRFSAHYTLDTGKSLIEGYAEKSRVYLWVFLILCILSATINAGAVAIVTAAIVKMAIPSL 

1 1 1 1 1 1 1 1 1 1 1 1 1 1 1' i 1 1 1 1 II 1 1 1 1 1 M ! 1 1 1 1 1 1 1 1 1 1 ' ! 1 1 1 1 1 1 1 1 1 1 1 1 

orf 53-1 FRFSAHYTLDTGKSLIEGYAEKSRVYLWVFLILCILSATINAGAVAIVTAAIVKMAIPSL 

70 80 90 . 100 110 120 



130 140 150 160 170 180 

orf 53a . pep MFDAGTVAALIMASCLIILVSGRYRALDRVSKIIIVTLSIATLAAAGIAMSRGMQMQSDF 

MINIMI MM I II I II II MINIMI MINIM llllllllllllll MINIM 

orf 53-1 MFDAGTVAAL I MAS CL I ILVSGRYRALDRVSKI 1 1 VTLS I ATLAAAGIAMSRGMQMQSDF 

130 140 150 160 170 180 

190 200 210 220 230 240 
orf53a.pep IEPTPWTLAGLGFLIALMGWMPAPIEISAINSLWVTEKQRINPSEYRDGIFDFNVGYIAS

I I I I I I I II I I I I I I I I I I I I I I I I I I II I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 
orf 53 - 1 IEPTPWTLAGLGFLIALMGWMPAPIEISAINSLWVTEKQRINPSEYRDGIFDFNVGYIAS 

190 200 210 220 230 240 

250 260 270 280 290 300 

orf 53a . pep AVLALVFLALGAFVQYGNGEAVQMAGGKYIGQLINMYAVTIGGWSRPLVAFIAFACMYGT 

1 1 1 1 II I II 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 II I M 1 1 1 1 1 1 1 1 1 1 1 1 ; 1 1 1 1 1 1 1 1 1 1 1 1 1 1 I 

orf 53 - 1 AVLALVFLALGAFVQYGNGEAVQMAGGKYIGQLINMYAVTIGGWSRPLVAFIAFACMYGT 

250 260 270 280 290 300 






310 320 330 340 350 360 

orf 53a . pep TITVVDGYARAIAEPVRLLRGKDKTGNAEFFAWNIWVAGSGLAVIFWFDGVMANLLKFAM 

I 1 1 Ml 1 1 1 I M 1 1 1 1 1 1 1 1 1 1 1 1 1 I 1 1 1 1 1 1 1 1 1 1 1 1 1 1 T 1 1 M I 1 1 1 1 1 1 1 1 1 I 

orf 53 - 1 TITVVDGYARAIAEPVRLLRGKDKTGNAEFFAWNIWAGSGLAVIFWFDGVMANLLKFAM 

310 320 330 340 350 360 






370 380 390 400 410 

orf 53a . pep IAAFVSAPVFAWLNYRLVKGDEKHKLTSGMNALALAGLIYLTGFTVLFLLNLAGMFKX 

I I I I I I I I I • I I I I I I I I I I I I Ih I I I M II I I I I I II I I I I I I I I I I I I I I I I , 
orf 53 - 1 IAAFVSAPVFAWLNYRLVKGDEKHKLTSGMNALALAGLIYLTGFTVLFLLNLAGMFKX 

370 380 390 400 410 



Homology with a predicted ORF from N. gonorrhoeae

ORF53 (SEQ ID NO: 478) shows 92.1% identity over a 139aa overlap with a predicted ORF
(ORF53ng) (SEQ ID NO: 484) from N. gonorrhoeae:






orf 53 .pep VSGRYRALDRVSKI I I VTLS IATLAAAGIA 30 

1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 M 1 1 1 

orf 53ng AAIVKMAI PSLMFDAGTVAALIMASCLI ILVSGRYRALDRVSKI 1 1 VTLS IATLAAAGIA 91 

orf 53 . pep MSRGMQMQSDFI EPTPWTLAGLGFL I ALMGWMPAP I E I S AINSLWVTEKQRINPSEYRDG 90 

MINIM I I I I I I I I I I I I I M I I I I I I I I I I I I I I I M I I I I I I I I I I I I I I 

orf 53ng MSRGMQMQPDF I EPTPWTLAGLGFL I ALMGWMPAP I EISAINSLWVTEKQRINPSEYRDG 151 

orf 53 . pep I FE FNVGY I AS AVLALVFLALGXVAPNGNGXTVQMAGGKYNGQL I NM YA 13 9 

|:| II I III I Mill I III : Ml Ml hi I II Ml II III 

orf 53ng. I FD FNVGY I ASAVLALVFLALGAFVQYGNGEAVQMGGGKY I GQL INMYAVT I GGGSRPLV 211 



An ORF53ng nucleotide sequence [<SEQ ID 483>] (SEQ ID NO: 483) was predicted to encode a
protein having amino acid sequence (SEQ ID NO: 484):



1 MPKKSCVYLW VFLILCIASA TINAGAVAIV TAAIVKMAIP SLMFDAGTVA
51 ALIMASCLII LVSGRYRALD RVSKIIIVTL SIATLAAAGI AMSRGMQMQP
101 DFIEPTPWTL AGLGFLIALM GWMPAPIEIS AINSLWVTEK QRINPSEYRD
151 GIFDFNVGYI ASAVLALVFL ALGAFVQYGN GEAVQMGGGK YIGQLINMYA
201 VTIGGGSRPL VAFIAFACMY GAASTWD GY ARAIAEPVRL LRGKDKTARP
251 IVLLEKLGGR HRFGRDFLV*



Further analysis revealed a further partial gonococcal DNA sequence [<SEQ ID 485>] (SEQ ID NO: 485):

1 . . aagaAAAGCT GCGTTTATTT GTGGGTTTTT TTGATTTTGT GTATCGCCTC

51 CGCCACGATT AACGCGGGCG CGGTCGCCAT TGTAACCGCC GCCATCGTCA 

101 AAATGGCGAT TCCCTCGCTG ATGTTTGATG CCGGCACGGT TGCCGCCTTG 

151 ATTATGGCAT CCTGCCTGAT TATTTTGGTG AGCGGACGTT ACCGCGCTTT 

201 GGATCGTGTT TCCAAAATCA TCATTGTTAC TTTGAGCATC GCCACGCTTG

251 CCGCCGCCGG CATCGCTATG TCGCGCGGTA TGCAGATGCA GCCCGATTTT

301 ATCGAGCCGA CACCGTGGAC GCTTGCCGGT TTGGGCTTCC TGATCGCGCT
351 GATGGGCTGG ATGCCCGCGC CGATCGAAAT TTCCGCCATC AATTCTTTGT

401 GGGTAACCGA AAAACAACGC ATCAATCCTT CTGAATACCG CGACGGGATT
451 TTCGATTTCA ACGTCGGTTA TATCGCcagT GCGGTTTTGG CTTTGGTTTT
501 CCTTGCACTG GGCGCGTTTG TGCAATACGG CAACGGCGAA GCAGTGCAGA 
551 TGGCGGGCGG CAAATATATC GGGCAATTGA TTAATATGTA TGCCGTAACC 
601 ATCGGCGGCT GGTCTCGTCC GCTGGTGGCG TTTATCGCGT TTGCCTGTAT 
651 GTACGGCACG ACGATTACCG TTGTGGACGG TTATGCGCGT GCCATTGCCG 
701 AACCCGTGCG CCTGCTGCGC GGCAGGGATA AAACCGGCAA CGCCGAGTTG 
751 TTtgccTGGA ATATTTGGGT GGCGGGCAGC GGTTTGGCGG TGATTTTCTG 
801 GTTTGACggc gcaaTGGCgG AACtgcTCAA ATTTGCGATG ATtgccgcCT
851 TTGTGTCCGC CCCTGTGTTC GCCTGGCTCA ACTACCGCCT CGTCAAAGGG 
901 GACAAACGCC ACAGGCTTAC CGCCGGTATG AACGCCCTTG CCATTGTCGG 
951 CCTGCTCTAC CTGGCCGGGT TTGCCGTTTT GTTCCTGTTG AACCTTACCG 

1001 GACTTTTGGC ATAG 

This corresponds to the amino acid sequence [<SEQ ID 486; ORF53ng-1>] (SEQ ID NO: 486; ORF53ng-1):



1 . .KKSCVYLWVF LILCIASATI NAGAVAIVTA AIVKMAIPSL MFDAGTVAAL
51 IMASCLIILV SGRYRALDRV SKIIIVTLSI ATLAAAGIAM SRGMQMQPDF
101 IEPTPWTLAG LGFLIALMGW MPAPIEISAI NSLWVTEKQR INPSEYRDGI
151 FDFNVGYIAS AVLALVFLAL GAFVQYGNGE AVQMAGGKYI GQLINMYAVT
201 IGGWSRPLVA FIAFACMYGT TITVVDGYAR AIAEPVRLLR GRDKTGNAEL
251 FAWNIWVAGS GLAVIFWFDG AMAELLKFAM IAAFVSAPVF AWLNYRLVKG
301 DKRHRLTAGM NALAIVGLLY LAGFAVLFLL NLTGLLA*

ORF53ng-1 (SEQ ID NO: 486) and ORF53-1 (SEQ ID NO: 480) show 94.0% identity in 336 aa
overlap:



60 70 80 90 100 110 

orf 53-1 .pep ILTNLFKYPFFRFSAHYTLDTGKSLIEGYAEKSRVYLWVFLILCILSATINAGAVAIVTA 

:| lllllllllll I I I I I I , I I I I I M 
orf53ng-l KKSCVYLWVFLILCIASATINAGAVAIVTA 

10 20 30 

120 130 140 150 160 170 

orf 53-1 .pep AIVKMAI PSLMFDAGTVAALIMASCLI ILVSGRYRALDRVSKI I IVTLS I ATLAAAGIAM 

II M 1 1 MM 1 1 1 1 1 II 1 1 1 II IN II I II I II N II M M 1 1 M 1 1 1 Mi II 1 1 1 II 

orf 53ng-l AIVKMAI PSLMFDAGTVAALIMASCLI ILVSGRYRALDRVSKI I IVTLS I ATLAAAGIAM 

40 50 60 70 80 90 



180 190 200 210 220 230 

orf 53-1 .pep SRGMQMQSDFIEPTPWTLAGLGFLIALMGWMPAPIEISAINSLWVTEKQRINPSEYRDGI 

IMIIII III MMMMMMMMMMMMIMMMIMMMIMI III 

orf53ng-l SRGMQMQPDFIEPTPWTLAGLGFLIALMGWMPAPIEISAINSLWVTEKQRINPSEYRDGI 

100 110 120 130 140 150 



orf 53 -1 .pep 



240 250 260 270 280 290 

FDFNVGY I AS AVLALVFLALGAFVQYGNGEAVQMAGGKY I GQLINMYAVT IGGWSRPLVA 
||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||

orf 53ng- 1 FDFNVGY I AS AVLALVFLALGAFVQYGNGEAVQMAGGKY I GQL I NMYAVT I GGWSRPLVA 

160 170 180 190 200 210 

300 310 320 330 340 350 

orf 53-1 .pep FIAFACMYGTTITWDGYARAIAEPVRLLRGKDKTGNAEFFAWNIWVAGSGLAVIFWFDG 

MIIIIIIMIIMIIIIIII IIMMIIIhMIIII MIIIIIIMI I MINIMI 

orf53ng-l F I AFACM YGTT I TWDGYARA I AE PVRLLRGRDKTGNAELFAWN I WVAGSGLAV I FWFDG 

220 230 240 250 260 270 
360 370 380 390 400 410

orf 53 - 1 . pep VMANLLKFAMIAAFVSAPVFAWLNYRLVKGDEKHKLTSGMNALALAGLIYLTGFTVLFLL 

:| I : I I I I M I M I I M I I I I I I I I I M h ^ M hi I M M : : I I : I h I h I I I I 
orf53ng-l AMAELLKFAMIAAFVSAPVFAWLNYRLVKGDKRHRLTAGMNALAIVGLLYLAGFAVLFLL 

280 290 300 310 320 330' 






orf 53-1 .pep NLAGMFKX 

I I : I - 
orf53ng-l NLTGLLAX 






Based on this analysis, including the presence of a putative leader sequence (double-underlined) 
and several putative transmembrane domains (single-underlined) in the gonococcal protein, it is 
predicted that the proteins from N. meningitidis and N. gonorrhoeae, and their epitopes, could be 
useful antigens for vaccines or diagnostics, or for raising antibodies. 



Example 58 

The following partial DNA sequence was identified in N. meningitidis [<SEQ ID 487>] (SEQ ID NO: 487):



1 . TTGCGGGAAA CGGCATATGT TTTGGATAGT TTTGATCGTT ATTTTGTTGT
51 TGCGCTTGCC GGCTTGTTTT TTGTCCGCGC ACAATCCGAA CGCGAGTGGA
101 TGCGCGAGGT TTCTGCGTGG CAGGAAAAGA AAGGGGAAAA ACAGGCGGAG
151 CTGCCTGAAA TCAAAGACGG TATGCCCGAT TTTCCCGAAC TTGCCCTGAT
201 GCTTTTCCAC GCCGTCAAAA CGGCAGTGTA TTGGCTGTTT GTCGGTGTCG
251 TCCGTTTCTG CCGAAACTAT CTGGCGCACG AATCCGAACC GGACAGGCCC
301 GTTCCGCCT . .



This corresponds to the amino acid sequence [<SEQ ID 488; ORF58>] (SEQ ID NO: 488; ORF58):



1 . . LRETAYVLDS FDRYFVVALA GLFFVRAQSE REWMREVSAW QEKKGEKQAE

51 LPEIKDGMPD FPELALMLFH AVKTAVYWLF VGVVRFCRNY LAHESEPDRP
101 VPP..

Further work revealed the complete nucleotide sequence [<SEQ ID 489>] (SEQ ID NO: 489):



1 ATGTTTTGGA TAGTTTTGAT CGTTATTTTG TTGCTTGCGC TTGCCGGCTT

51 GTTTTTTGTC CGCGCACAAT CCGAACGCGA GTGGATGCGC GAGGTTTCTG

101 CGTGGCAGGA AAAGAAAGGG GAAAAACAGG CGGAGCTGCC TGAAATCAAA 

151 GACGGTATGC CCGATTTTCC CGAACTTGCC CTGATGCTTT TCCATGCCGT 

201 CAAAACGGCA GTGTATTGGC TGTTTGTCGG TGTCGTCCGT TTCTGCCGAA 

251 ACTATCTGGC GCACGAATCC GAACCGGACA GGCCCGTTCC GCCTGCTTCT

301 GCAAACCGTG CGGATGTTCC GACCGCATCC GACGGATATT CAGACAGTGG 

351 AAACGGGACG GAAGAAGCGG AAACGGAAGA AGCAGAAGCT GCGGAGGAAG 

401 AGGCTGCCGA TACGGAAGAC ATTGCAACTG CCGTAATCGA CAACCGCCGC

451 ATCCCATTCG ACCGGAGTAT TGCTGAAGGG TTGATGCCGT CTGAAAGCGA

501 AATTTCGCCC GTCCGTCCGG TTTTTAAAGA AATCACTTTG GAAGAAGCAA

551 CGCGTGCTTT AAACAGCGCG GCTTTAAGGG AAACGAAAAA ACGCTATATC 

601 GATGCATTTG AGAAAAACGA AACAGCGGTC CCCAAAGTCC GCGTGTCCGA 

651 TACCCCGATG GAAGGGCTGC AGATTATCGG TTTGGACGAC CCTGTGCTTC 

701 AACGCACGTA TTCCCATATG TTCGATGCGG ACAAAGAAGC GTTTTCCGAG 

751 TCTGCGGATT ACGGATTTGA GCCGTATTTT GAGAAGCAGC ATCCGTCTGC

801 CTTTTCTGCA GTCAAAGCCG AAAATGCACG GAATGCGCCG TTCCACCGTC 

851 ATGCAGGGCA GGGGAAAGGG CAGGCGGAGG CAAAATCCCC GGATGTTTCC

901 CAAGGGCAGT CCGTTTCAGA CGGCACGGCC GTCCGCGATG CCCGCCGCCG 

951 CGTTTCCGTC AATTTGAAAG AACCGAACAA GGCAACGGTT TCTGCGGAGG 

1001 CGCGAATTTC TCGCCTGATT CCGGAAAGTC AGACGGTTGT CGGGAAACGG

1051 GATGTCGAAA TGCCGTCTGA AACCGAAAAT GTTTTCACGG AAACCGTTTC 

1101 GTCTGTGGGA TACGGCGGTC CGGTTTATGA TGAAACTGCC GATATCCATA 

1151 TTGAAGAACC TGCCGCGCCC GATGCTTGGG TGGTCGAACC ACCCGAAGTG 

1201 CCGAAAGTTC CCATGACCGC AATCGATATT CAGCCGCCGC CTCCCGTATC
1251 GGAAATCTAC AACCGTACCT ATGAACCGCC GTCAGGATTC GAGCAGGTGC

1301 AACGCAGCCG CATTGCCGAG ACCGACCATC TTGCCGATGA TGTTTTGAAT
1351 GGAGGTTGGC AGGAGGAAAC CGCCGCTATT GCGGATGACG GCAGTGAAGG

1401 TGCGGCAGAG CGGTCAAGCG GGCAATATCT GTCGGAAACC GAAGCGTTCG
1451 GGCATGACAG TCAGGCGGTT TGTCCGTTTG AAAATGTGCC GTCTGAACGC

30 1501 CCGTCCTGCC GGGTATCGGA TACGGAAGCG GATGAAGGGG CGTTCCCATC 
1551 . TGAAGAAACC GGTGCGGTAT CCGAACACCT GCCGACAACC GACCTGCTTC 

1601 TGCCTCCGCT GTTCAATCCC GAGGCGACGC AAACCGAAGA AGAACTGTTG 

1651 GAAAACAGCA TCACCATCGA AGAAAAATTG GCGGAGTTCA AAGTCAAGGT 

1701 CAAGGTTGTC GATTCTTATT CCGGCCCCGT AATTACGCGT TATGAAATCG 

1751 AACCCGATGT CGGCGTGCGC GGCAATTCCG TTCTGAATCT GGAAAAAGAT

1801 TTGGCGCGTT CGCTCGGCGT GGCTTCCATC CGCGTTGTCG AAACCATCCC 

1851 CGGCAAAACC TGCATGGGTT TGGAACTTCC GAACCCGAAA CGCCAAATGA 

1901 TACGCCTGAG CGAAATCTTC AATTCGCCCG AGTTTGCCGA ATCCAAATCC 

1951 AAGCTGACGC TCGCGCTCGG TCAGGACATC ACCGGACAGC CCGTCGTAAC 

2001 CGACTTGGGA AAAGCACCGC ATTTGTTGGT TGCCGGCACG ACCGGTTCGG

2051 GCAAATCGGT GGGTGTCAAC GCGATGATTC TGTCTATGCT TTTCAAAGCC 

2101 GCGCCGGAAG ACGTGCGTAT GATTATGATC GATCCGAAAA TGCTGGAATT 

2151 GAGCATTTAC GAAGGCATCC CGCACCTGCT CGCCCCTGTC GTTACCGATA 

2201 TGAAGCTGGC GGCAAACGCG CTGAACTGGT GTGTTAACGA AATGGAAAAA

2251 CGCTACCGCC TGATGAGCTT TATGGGCGTG CGTAATCTTG CGGGCTTCAA

2301 TCAAAAAATC GCCGAAGCCG CAGCAAGGGG AGAAAAAATC GGCAATCCGT 

2351 TCAGCCTCAC GCCCGACGAT CCCGAACCTT TGGAAAAACT GCCGTTTATC 

2401 GTGGTCGTGG TCGATGAGTT TGCCGACCTG ATGATGACGG CAGGCAAGAA

2451 AATCGAAGAA CTGATTGCCC GCCTCGCCCA AAAAGCCCGC GCGGCAGGCA

2501 TCCATTTGAT TCTTGCCACA CAACGCCCCA GCGTCGATGT CATCACGGGT

2551 CTGATTAAGG CGAACATCCC GACGCGTATC GCGTTCCAAG TGTCCAGCAA 

2601 AATCGACAGC CGCACGATTC TCGACCAAAT GGGCGCGGAA AACCTGCTCG 

2651 GTCAGGGCGA TATGCTGTTC CTGCTGCCGG GTACTGCCTA TCCGCAGCGC 

2701 GTTCACGGCG CGTTTGCCTC GGATGAAGAG GTGCACCGCG TGGTCGAATA

2751 TTTGAAACAG TTTGGCGAAC CGGACTATGT TGACGATATT TTGAGCGGCG

2801 GCGGCAGCGA AGAGCTGCCC GGCATCGGGC GCAGCGGCGA CGACGAAACC

2851 GATCCGATGT ACGACGAGGC CGTATCCGTT GTCCTGAAAA CGCGCAAAGC

2901 CAGCATTTCG GGCGTACAGC GCGCCTTGCG TATCGGCTAC AACCGCGCCG

2951 CGCGTCTGAT TGACCAGATG GAGGCGGAAG GCATTGTGTC CGCACCGGAA

3001 CACAACGGCA ACCGTACGAT TCTCGTCCCC TTGGACAATG CTTGA

This corresponds to the amino acid sequence [<SEQ ID 490; ORF58-1>] (SEQ ID NO: 490; ORF58-1):



1 MFWIVLIVIL LLALAGLFFV RAQSEREWMR EVSAWQEKKG EKQAELPEIK
51 DGMPDFPELA LMLFHAVKTA VYWLFVGVVR FCRNYLAHES EPDRPVPPAS
101 ANRADVPTAS DGYSDSGNGT EEAETEEAEA AEEEAADTED IATAVIDNRR
151 IPFDRSIAEG LMPSESEISP VRPVFKEITL EEATRALNSA ALRETKKRYI
201 DAFEKNETAV PKVRVSDTPM EGLQIIGLDD PVLQRTYSHM FDADKEAFSE
251 SADYGFEPYF EKQHPSAFSA VKAENARNAP FHRHAGQGKG QAEAKSPDVS
301 QGQSVSDGTA VRDARRRVSV NLKEPNKATV SAEARISRLI PESQTVVGKR
351 DVEMPSETEN VFTETVSSVG YGGPVYDETA DIHIEEPAAP DAWVVEPPEV
401 PKVPMTAIDI QPPPPVSEIY NRTYEPPSGF EQVQRSRIAE TDHLADDVLN
451 GGWQEETAAI ADDGSEGAAE RSSGQYLSET EAFGHDSQAV CPFENVPSER
501 PSCRVSDTEA DEGAFPSEET GAVSEHLPTT DLLLPPLFNP EATQTEEELL
551 ENSITIEEKL AEFKVKVKVV DSYSGPVITR YEIEPDVGVR GNSVLNLEKD
601 LARSLGVASI RVVETIPGKT CMGLELPNPK RQMIRLSEIF NSPEFAESKS
651 KLTLALGQDI TGQPVVTDLG KAPHLLVAGT TGSGKSVGVN AMILSMLFKA
701 APEDVRMIMI DPKMLELSIY EGIPHLLAPV VTDMKLAANA LNWCVNEMEK
751 RYRLMSFMGV RNLAGFNQKI AEAAARGEKI GNPFSLTPDD PEPLEKLPFI
801 VVVVDEFADL MMTAGKKIEE LIARLAQKAR AAGIHLILAT QRPSVDVITG
851 LIKANIPTRI AFQVSSKIDS RTILDQMGAE NLLGQGDMLF LLPGTAYPQR
901 VHGAFASDEE VHRVVEYLKQ FGEPDYVDDI LSGGGSEELP GIGRSGDDET
951 DPMYDEAVSV VLKTRKASIS GVQRALRIGY NRAARLIDQM EAEGIVSAPE
1001 HNGNRTILVP LDNA*
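
The correspondence between the nucleotide listing (SEQ ID NO: 489) and the amino acid listing (SEQ ID NO: 490) can be spot-checked by translating the first row of the DNA. The following is a minimal sketch, assuming the standard genetic code; the function name `translate` is illustrative and does not come from the specification. Any codon containing an ambiguous base such as N is reported as X in this sketch.

```python
# Minimal sketch (assumption: standard genetic code, reading frame starts at base 1).
from itertools import product

BASES = "TCAG"
# Amino acids for codons in TCAG order (TTT, TTC, TTA, TTG, TCT, ...).
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(dna: str) -> str:
    """Translate DNA to protein; whitespace is ignored, unknown codons give X."""
    dna = "".join(dna.split()).upper()
    usable = len(dna) - len(dna) % 3  # drop an incomplete trailing codon
    return "".join(
        CODON_TABLE.get(dna[i:i + 3], "X") for i in range(0, usable, 3)
    )


# First listing row of SEQ ID NO: 489 as printed above.
first_row = "ATGTTTTGGA TAGTTTTGAT CGTTATTTTG TTGCTTGCGC TTGCCGGCTT"
print(translate(first_row))  # MFWIVLIVILLLALAG
```

The sixteen residues produced agree with the start of the amino acid listing above.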

Computer analysis of this amino acid sequence predicts the indicated transmembrane region, and 
also gave the following results: 

Homology with a predicted ORF from N. meningitidis (strain A)

ORF58 (SEQ ID NO: 488) shows 96.6% identity over an 89 aa overlap with an ORF (ORF58a)
(SEQ ID NO: 492) from strain A of N. meningitidis:



10 20 30 40 50 60 

orf 58 . pep LRETAYVLDSFDRYF'V VALAGLFFVRAQS EREWMREVSAWQEKKGEKQAELPEIKDGMPD 

:: :| I I I I U I I I I I I I I I I I I I I I I I I I I I I M I I I I I I I i I I I 
orf 58a MFWIVLIVILLLALAGLFFVRAQS EREWMREVSAWQEKKGEKQAELPEIKDGMPD 

10 20 30 40 50 



70 80 ' 90 100 

orf 5 8 . pep FPELALM LFHAVKTAVYWLFVGW RFCRNYLAHESEPDRPVPP 

I I I I I I I I I 1 I I I I I I I I I I I I M I M I I I I I I I I I I I I 
orf 58a FPELALM LFHAVKTAVYWLFVGW RFCRNYLiAHESEPDRPVPPASANRADVPTASDGYSD 
60 70 80 90 100 110 



The complete length ORF58a nucleotide sequence [<SEQ ID 491>] (SEQ ID NO: 491) is:



1 ATGTTTTGGA TAGTTTTGAT CGTTATTTTG TTGCTTGCGC TTGCCGGCTT 

51 GTTTTTTGTC CGCGCACAAT CCGAACGCGA GTGGATGCGC GAGGTTTCTG 

101 CGTGGCAGGA AAAGAAAGGG GAAAAACAGG CGGAGCTGCC TGAAATCAAA 

151 GACGGTATGC CCGATTTTCC CGAACTTGCC CTGATGCTTT TCCATGCCGT 

201 CAAAACGGCA GTGTATTGGC TGTTTGTCGG TGTCGTCCGT TTCTGCCGAA

251 ACTATCTGGC GCACGAATCC GAACCGGACA GGCCCGTTCC GCCTGCTTCT

301 GCAAATCGTG CGGATGTTCC GACCGCATCC GACGGATATT CAGACAGTGG 

351 AAACGGGACG GAAGAAGCGG AAACGGAAGA AGCAGAAGCT GCGGAGGAAG 

401 AGGCTGCCGA TACGGAAGAC ATTGCAACTG CCGTAATCGA CAACCGCCGC

451 ATCCCATTCG ACCGGAGTAT TGCTGAAGGG TTGATGCCGT CTGAAAGCGA

501 AATTTCGCCC GTCCGTCCGG TTTTTAAGGA AATCACTTTG GAAGAAGCAA 

551 CGCGTGCTTT AAACAGCGCG GCTTTAAGGG AAACGAAAAA ACGCTATATC 

601 GATGCATTTG AGAAAAACGA AACAGCGGTC CCCAAAGTCC GCGTGTCCGA 

651 TACCCCGATG GAAGGGCTGC AGATTATCGG TTTGGACGAC CCTGTGCTTC 

701 AACGCACGTA TTCCCGTATG TTCGATGCGG ACAAAGAAGC GTTTTCCGAG

751 TCTGCGGATT ACGGATTTGA GCCGTATTTT GAGAAGCAGC ATCCGTCTGC 

801 CTTTTCTGCA GTCAAAGCCG AAAATGCACG GAATGCGCCG TTCCGCCGTC 

851 ATGCAGGGCA GGGNAAAGGG CAGGCGGAGG CNAAATCCCC GGATGTTTCC 

901 CAAGGGCAGT CCGTTTCAGA CGGCACAGCC GTCCGCGATG CCNGCCGCCG 

951 CGTTTCCGTC AATTTGAAAG AACCGAACAA GGCAACGGTT TCTGCGGAGG

1001 CGCGGATTTC GCGCCTGATT CCGGAAAGTC GGACGGTTGT CGGGAAACGG 

1051 GATGTCGAAA TGCCGTCTGA AACCGAAAAT GTTTTCACGG AAANTGTTTC 

1101 GTCTGTGGGA TACGGCGNTC CGGTTTATGA TGAAACTGCC GATATCCATA 

1151 TTGAAGAACC TGCCGCGCCC GATGCTTGGG TGGTCGAACC ACCCGAAGTG 

1201 CCGAAAGTTC CCATGCCCGC AATNGATATT CCGCCGCCGC CTCCCGTATC

1251 GGAAATCTAC AACCGTACCT ATGAACCGCC GGCAGGATTC GAGCAGGTGC 

1301 AACGCAGCCG CATTGCCGAA ACCGATCATC TTGCCGATGA TGTTTTGAAT 

1351 GGAGGTTGGC AGGAGGAAAC CGCCGCTATT GCGAATGACG GCAGTGAGGG

1401 TGTGGCAGAG CGGTCAAGCG GGCAATATTT GTCGGAAACC GAAGCGTTCG

1451 GGCATGACAG TCAGGCGGTT TGTCCGTTTG AAAATGTGCC GTCTGAACGC

1501 CCGTCCCGCC GGGCATNGGA TACGGAAGCG GATGAAGGGG CGTTCCAATC 

1551 TGAAGAAACC GGTGCGGTAT CCGAACACCT GCCGACAACC GACCTGCTTC 

1601 TGCCGCCGCT GTTCAATCCC GGGGCGACGC AAACCGAAGA AGANCTGTTG 

1651 GANAACAGCA TCACCATCGA AGAAAAATNG GCGGAGTTCA AAGTCAAGGT

1701 CAAGGTTGTC GATTCTTATT CCGGCCCCGT GATTACGCGT TATGAAATCG

1751 AACCCGATGT CGGCGTGCGC GGCAATTCCG TTCTAAATCT GGAAAAAGAN 

1801 TTGGCGCGTT CGCTCGGCGT GGCTTCCATC CGCGTTGTCG AAACCATCCT 

1851 CGGCAAAACC TGTATGGGTT TGGAACTTCC GAACCCGAAA CGCCAAATGA 

1901 TACGCCTGAG CGAAATCTTC AATTCGCCCG AGTTTGCCGA ATCCAAATCC 

1951 AAGCTGACGC TCGCGCTCGG TCAGGACATC ACCGGACAGC CCGTCGTAAC

2001 CGACTTGGGC AAAGCACCGC ATTTGTTGGT TGCCGGCACG ACCGGTTCGG 

2051 GCAAATCGGT GGGTGTCAAC GCGATGATTC TGTCTATGCT TTTCAAAGCC 

2101 GCGCCGGAAG ACGTGCGTAT GATTATGATC GATCCGAAAA TGCTGGAATT 

2151 GAGCATTTAC GAAGGCATCC CGCACCTGCT CGCCCCTGTC GTTACCGATA 

2201 TGAAGCTGGC GGCAAACGCG CTGAACTGGT GTGTTAACGA AATGGAAAAA

2251 CGCTACCGCC TGATGAGCTT TATGGGCGTG CGCAATCTTG CGGGTNTCAA 

2301 TCAAAAAATC GCCGAAGCCG CAGCAAGGGG GGAGAAAATC GGCAACCCGT 

2351 TCAGCCTCAC GCCCGACAAT CCCGAACCTT TGGANAAATT GCCGTTTATC 

2401 GTGGTCGTGG TTGATGAGTT TGCCGACCTG ATGATGACGG CAGGCAAGAA

2451 AATCGAAGAA CTGATTGCCC GCCTCGCCCA AAAAGCCCGC GCGGCAGGCA

2501 TCCATCTTAT CCTTGCCACA CAACGCCCCA GTGTCGATGT CATCACGGGT 

2551 CTGATTAAGG CGAACATCCC GACGCGTATC GCGTTCCAAG TGTCCAGCAA 

2601 AATCGACAGC CGCACGATTC TTGACCAAAT GGGTGCGGAA AACCTGCTCG 

2651 GGCAGGGCGA TATGCTGTTC CTGCCGCCGG GTACGGCCTA TCCGCAGCGC 

2701 GTTCACGGCG CGTTTGCCTC GGATGAAGAG GTGCACCGCG TGGTCGAATA

2751 TCTGAAACAG TTTGGCGAAC CGGACTATGT TGACGATATN TTGAGCGGCG 

2801 GTATGTCCGA CGATTTGCTG GGAATCAGCC GGAGCGGCGA CGGCGAAACC 

2851 GATCCGATGT ACGACGAGGC CGTGTCNGTT GTTTTGAAAA CGCGCAAAGC 

2901 CAGCATTTCT GGCGTGCAGC GCGCATTGCG TATCGGCTAT AATCGCGCCG 

2951 CGCGTCTGAT TGACCAGATG GAGGCGGAAG GCATTGTGTC CGCACCGGAA

3001 CACAACGGCA ACCGTACGAT TCTCGTCCCC TTNGACAATG CTTGA 
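
Each row of the listing above carries the position of its first base followed by up to five blocks of ten bases. A short script can rejoin the rows and confirm that the printed positions are consistent. This is a minimal sketch under that assumption about the layout; `join_listing` is a hypothetical helper name, not something defined in the specification.

```python
# Minimal sketch: rejoin sequence-listing rows and verify their position labels.
def join_listing(rows):
    """Concatenate listing rows of the form '<position> <block> <block> ...'.

    Raises ValueError if a row's position label does not match the number of
    residues already read plus one.
    """
    sequence = ""
    for row in rows:
        fields = row.split()
        index, blocks = int(fields[0]), fields[1:]
        expected = len(sequence) + 1
        if index != expected:
            raise ValueError(
                f"row labelled {index} starts at position {expected}"
            )
        sequence += "".join(blocks)
    return sequence


# First two rows of SEQ ID NO: 491 as printed above.
rows = [
    "1 ATGTTTTGGA TAGTTTTGAT CGTTATTTTG TTGCTTGCGC TTGCCGGCTT",
    "51 GTTTTTTGTC CGCGCACAAT CCGAACGCGA GTGGATGCGC GAGGTTTCTG",
]
orf58a_start = join_listing(rows)
print(len(orf58a_start))  # 100
```

A position check of this kind catches dropped or duplicated rows before a sequence is translated or aligned.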

This encodes a protein having amino acid sequence [<SEQ ID 492>] (SEQ ID NO: 492):



1 MFWIVLIVIL LLALAGLFFV RAQSEREWMR EVSAWQEKKG EKQAELPEIK
51 DGMPDFPELA LMLFHAVKTA VYWLFVGVVR FCRNYLAHES EPDRPVPPAS
101 ANRADVPTAS DGYSDSGNGT EEAETEEAEA AEEEAADTED IATAVIDNRR
151 IPFDRSIAEG LMPSESEISP VRPVFKEITL EEATRALNSA ALRETKKRYI
201 DAFEKNETAV PKVRVSDTPM EGLQIIGLDD PVLQRTYSRM FDADKEAFSE
251 SADYGFEPYF EKQHPSAFSA VKAENARNAP FRRHAGQGKG QAEAKSPDVS
301 QGQSVSDGTA VRDAXRRVSV NLKEPNKATV SAEARISRLI PESRTVVGKR
351 DVEMPSETEN VFTEXVSSVG YGXPVYDETA DIHIEEPAAP DAWVVEPPEV
401 PKVPMPAXDI PPPPPVSEIY NRTYEPPAGF EQVQRSRIAE TDHLADDVLN
451 GGWQEETAAI ANDGSEGVAE RSSGQYLSET EAFGHDSQAV CPFENVPSER
501 PSRRAXDTEA DEGAFQSEET GAVSEHLPTT DLLLPPLFNP GATQTEEXLL
551 XNSITIEEKX AEFKVKVKVV DSYSGPVITR YEIEPDVGVR GNSVLNLEKX
601 LARSLGVASI RVVETILGKT CMGLELPNPK RQMIRLSEIF NSPEFAESKS
651 KLTLALGQDI TGQPVVTDLG KAPHLLVAGT TGSGKSVGVN AMILSMLFKA
701 APEDVRMIMI DPKMLELSIY EGIPHLLAPV VTDMKLAANA LNWCVNEMEK
751 RYRLMSFMGV RNLAGXNQKI AEAAARGEKI GNPFSLTPDN PEPLXKLPFI
801 VVVVDEFADL MMTAGKKIEE LIARLAQKAR AAGIHLILAT QRPSVDVITG
851 LIKANIPTRI AFQVSSKIDS RTILDQMGAE NLLGQGDMLF LPPGTAYPQR
901 VHGAFASDEE VHRVVEYLKQ FGEPDYVDDX LSGGMSDDLL GISRSGDGET
951 DPMYDEAVSV VLKTRKASIS GVQRALRIGY NRAARLIDQM EAEGIVSAPE
1001 HNGNRTILVP XDNA*




ORF58a (SEQ ID NO: 492) and ORF58-1 (SEQ ID NO: 490) show 96.6% identity in 1014 aa
overlap:



10 20 30 40 50 60 

25 orf58a.pep MFWIVLIVILLLAIAGLFFVRAQSEREWMREVSAWQEKKGEKQAELPEIKDGMPDFPELA 

IIIIIMIIIIIIIIIIIIIIIIIIIIIIIIIIIIMIIIIIIIIIIIIIIIIIIilMI 
orf 58-1 MFWIVLIVILLLALAGLFFVRAQSEREWMREVSAWQEKKGEKQAELPEIKDGMPDFPELA 

10 20 30 40 50 60 



70 80 90 100 110 120 

30 or f 58a . pep LMLFHAVKTAVYWLFVGWRFCRNYLAHESEPDRPVPPASANRADVPTASDGYSDSGNGT 

I II III II II II III II II II I II MM Mill II II II I II II II 111 MM II II 1 1! 

orf 58 - 1 LMLFHAVKTAVYWLFVGVVRFCRNYLAHESEPDRPVP PAS ANRADVPTAS DGYSDSGNGT 

70 80 90 100 110 . 120 



130 140 150 160 170 180 

35 orf 58a . pep EEAETEEAEAAEEEAADTEDIATAVIDNRRIPFDRSIAEGLMPSESEISPVRPVFKEITL . 

1 1 1 1 1 1 1! I 1 1 1 1 1 1 1 1 1 1 1 : 1 1 I Ml 1 1 1 M I 1 1 1 1 1 1 1 1 1 1 1 1 , 1 1 1 1 1 1 1 1 1 1 

orf 58-1 EEAETEEAEAAEEEAADTEDIATAVIDNRRIPFDRSIAEGLMPSESEISPVRPVFKEITL 

130 140 150 160 170 180 



190 200 210 220 230 240 

40 orf 58a . pep E EATRALNS AALRETKKRY I DAFE KNETAVP KVRVSDT PMEGLQ 1 1 GLDD P VLQRT YS RM 

I II II I I I I I I I II I I I I I I I I I I I I I I I II II II I I I I I I II I I I I I I I I II II I I h I 
orf 58-1 E EATRALNS AALRETKKRY I DAFEKNETAVP KVRVSDT PMEGLQ 1 1 GLDD P VLQRT YSHM 

190 200 210 220 230 240 



250 260 270 280 290 300 

45 orf 58a . pep FDADKEAFSESADYGFEPYFEKQHPSAFSAVKAENARNAPFRRHAGQGKGQAEAKSPDVS 

I II I II I I II I I II I II I II I I II II I I I I II I I I II I I I M II I I I I I II I I I M II II 
orf 58 - 1 FDADKEAFSESADYGFEPYFEKQHPSAFSAVKAENARNAPFHRHAGQGKGQAEAKSPDVS 

250 260 270 280 290 300 



310 320 330 340 350 360 

50 orf 58a . pep QGQSVSDGTAVRDAXRRVSVNLKEPNKATVSAEARISRLIPESRTWGKRDVEMPSETEN 

IMIMMMMII I I I I I I I I I I I I I I I I I I I I I I I I I I I E : I I I I I I I I I II I I I I I 
orf 58 - 1 QGQSVSDGTAVRDARRRVSVNLKEPNKATVSAEARISRLIPESQTWGKRDVEMPSETEN

310 320 330 340 350 360



370 380 390 400 410 420 

orf 58a . pep VFTEXVSSVGYGXPVYDETADIHIEEPAAPDAWWEPPEVPKVPMPAXDIPPPPPVSEIY 

III :|IMIM i 1 1 1 1 1 1 i 1 1 1 1 1 1 1 1 1 Mil 11 1 1 1 1 1 1 i I II MINIMI 

orf58-l VFTETVSSVGYGGPVYDETADIHIEEPAAPDAWWEPPEVPKVPMTAIDIQPPPPVSEIY 

370 380 390 400 410 420 






430 440 450 460 470 480 

orf 58a . pep NRTYEPPAGFEQVQRSRIAETDHLADDVLNGGWQEETAAIANDGSEGVAERSSGQYLSET 

|| | | | | | : I I I I I I II I I I I II I I I I I I i I I II I I I I I I I I I : I I I I h I I I I I I I I I I I I 
orf 58 - 1 NRTYEPPSGFEQVQRSRIAETDHLADDVLNGGWQEETAAIADDGSEGAAERSSGQYLSET 
430 440 450 460 470 480 






490 500 510 520 530 540 

orf 58a . pep EAFGHDSQAVCPFENVPSERPSRRAXDTEADEGAFQSEETGAVSEHLPTTDLLLPPLFNP 

1 1 1 II 1 1 1 1 1 1 1 II I II 1 1 1 1 1 I MINIMI 1 1 1 1 1 II 1 1 1 1 1 1 1 1 II M 1: 1 1 

orf 58-1 EAFGHDSQAVCPFENVPSERPSCRVSDTEADEGAFPSEETGAVSEHLPTTDLLLPPLFNP 
490 500 510 520 530 540 






550 560 570 580 590 600 

orf 58a . pep GATQTEEXLLXNSITIEEKXAEFKVKVKWDSYSGPVITRYEIEPDVGVRGNSVLNLEKX 

III III II II 1 1 Mil III MINN llllll MINIMUM II lllll I III 

orf 58-1 EATQTEEELLENSITIEEKLAEFKVKVKWDSYSGPVITRYEIEPDVGVRGNSVLNLEKD 

550 560 570 580 590 600 






610 620 630 640 650 660 

orf 58a . pep LARSLGVASIRWETILGKTCMGLELPNPKRQMIRLSEIFNSPEFAESKSKLTLALGQDI 

1 1 II 1 1 N 1 1 1 II II I Mill !l 1 1 1 1 II 1 1 1 1 1 1 Ml I IMI I II M 1 1 1 MM II 

orf 58-1 LARSLGVASIRWETIPGKTCMGLELPNPKRQMIRLSEIFNSPEFAESKSKLTLALGQDI 

610 620 630 640 650 660 






670 680 690 700 710 720 

orf 58a . pep TGQPWTDLGKAPHLLVAGTTGSGKSVGVNAMILSMLFKAAPEDVRMIMIDPKMLELSIY 

I II I I I I II I I I I I I II I II I II I II I I I II I I I N II I I I II I I I I I I II II I I I II I I 
orf 58 - 1 TGQPWTDLGKAPHLL VAGTTGSGKS VGVNAM I LSMLFKAAPEDVRM IMI DPKMLELS I Y 

670 680 690 700 710 720 






730 740 750 760 770 780 

orf 58a . pep EGIPHLLAPWTDMKLAANALNWCVNEMEKRYRLMSFMGVRNLAGXNQKIAEAAARGEKI 

llllllllillll MM IIIIMIIIIIIIMIIMIIIIII II I WWW II III 

or f 5 8 - 1 EG I PHLLAP WTDMKLAANALNWCVNEMEKRYRLMS FMGVRNLAGFNQKI AEAAARGEKI 

730 740 750 760 770 780 






790 800 810 820 830 840 

orf 58a. pep GNPFSLTPDNPEPLXKLPFIVVVVDEFADLMMTAGKKIEELIARLAQKARAAGIHLIIaAT 

IIIIMIMIII il'llll llllllllllllllll IIIIIMIIIIIIIIIIII 

orf 58-1 GNPFSLTPDDPEPLEKLPFIWWDEFADLMMTAGKKIEELIARLAQKARAAGIHLILAT 

790 800 810 820 830 840 






850 860 870 880 890 900 

orf 58a . pep QRPS VDV I TGL I KAN I PTRI AFQVS S KI DSRT I LDQMGAENLLGQGDMLFLP PGTAYPQR 

INI lllll INI 1 1 1 II III I WWW III II III WWW W I INI II II I II IN 

orf 58-1 QRPSVDVITGLIKANIPTRIAFQVSSKIDSRTILDQMGAENLLGQGDMLFLLPGTAYPQR 

850 860 870 880 890 900 



910 920 930 940 950 960 

orf 58a . pep VHGAFASDEEVHRWEYLKQFGEPDYVDDXLSGGMSDDLLGISRSGDGETDPMYDEAVSV 

1 1 M 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 INI MM HI IMIIIIIII
orf 58 - 1 VHGAFASDEEVHRWEYLKQFGEPDYVDDILSGGGSEELPGIGRSGDDETDPMYDEAVSV

910 920 930 940 950 960 

970 980 990 1000 1010 

orf 58a . pep VLKTRKASISGVQRALRIGYNRAARLIDQMEAEGIVSAPEHNGNRTILVPXDNAX 
Ml II I I I II M I I M I I I II I I I I I M I I I I I II I I I I II I I I I I I I | | | | 
orf 58-1 VLKTRKAS I SGVQRALRIGYNRAARL I DQMEAEG I VS APEHNGNRT I LVPLDNAX 

970 980 990 1000 1010 
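
An identity figure such as the 96.6% quoted for this alignment is the share of aligned columns in which both sequences carry the same residue. The following is a minimal sketch of that calculation, assuming the two sequences are already aligned to equal length with `-` as the gap character; `percent_identity` is an illustrative name, and the convention used by the original alignment program for gaps and for X residues may differ.

```python
# Minimal sketch: percent identity over pre-aligned sequences of equal length.
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Return identical columns as a percentage of all aligned columns."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    identical = sum(
        1 for a, b in zip(aligned_a, aligned_b) if a == b and a != "-"
    )
    return 100.0 * identical / len(aligned_a)


# Residues 421 to 430 of the alignment above (orf58a versus orf58-1).
print(percent_identity("NRTYEPPAGF", "NRTYEPPSGF"))  # 90.0
```

In this ten-residue excerpt the only difference is A versus S at position 428, giving nine identical columns out of ten.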

Homology with a predicted ORF from N. gonorrhoeae 

ORF58 (SEQ ID NO: 488) shows complete identity over a 9 aa overlap with a predicted ORF
(ORF58ng) (SEQ ID NO: 494) from N. gonorrhoeae:

orf58.pep   ALMLFHAVKTAVYWLFVGVVRFCRNYLAHESEPDRPVPP 103
                                          |||||||||
orf58ng                                   SEPDRPVPPASANRADVPTASDGYSDSGNG 30

The ORF58ng nucleotide sequence [<SEQ ID 493>] (SEQ ID NO: 493) is predicted to encode a
protein having partial amino acid sequence [<SEQ ID 494>] (SEQ ID NO: 494):

1 ..SEPDRPVPPA SANRADVPTA SDGYSDSGNG TEEAETEAAE AAEEEAADTE
51 DIATAVIDNR RIPFDRSIAE GLMQSESKTS PVRPVFKEIT LEEATRALSS
101 AALRETKKRY IDAFEKNGTA VPKVRVSDTP MEGLQIIGLD DPVLQRTYSR
151 MFDADKEAFS ESADYGFEPY FEKQHPSAFS AVKAENARNA PFRRHAGQEK
201 GQAEAKSPDV SQGQSVSDGT AVRDARRRVS VNLKEPNKAT VSAEARISRL
251 IPESRTVVGK RDVEMPSETE NVFTETVSSV GYGGPVYDEA ADIHIEEPAA
301 PDAWVVEPPE VPEVAVPEID ILPPPPVSEI YNRTYEPPAG FEQAQRSRIA
351 ETDHLAADVL NGGWQEETAA IADDGSEGAA ERSSGQYLSE TEAFGHDSQA
401 VCPFEDVPSE RPSCRVSDTE ADEGAFQSEE TGAVSEHLPT TDLLLPPLFN
451 PEATQTEEEL LENSITIEEK LAEFKVKVKV VDSYSGPVIT RYEIEPDVGV
501 RGNSVLNLEK DLARSLGVAS IRVVETIPGK TCMGLELPNP KRQMIRLSEI
551 FNSPEFAESK SKLTLALGQD ITGQPVVTDL GKAPHLLVAG TTGSGKSVGV
601 NAMILSMLFK AAPEDVRMIM IDPKMLELSI YEGITHLLAP VVTDMKLAAN
651 ALNWCVNEME KRYRLMSFMG VRNLAGFNQK IAEAAARGEK IGNPFSLTPD
701 DPEPLEKLPF IVVVVDEFAD LMMTAGKKIE ELIARLAQKA RAAGIHLILA
751 TQRPSVDVIT GLIKANIPTR IAFQVSSKID SRTILDQMGA ENLLGQGDML
801 FLPPGTAYPQ RVHGAFASDE EVHRVVEYLK QFGEPDYVDD ILSGGGSEEL
851 PGIGRSGDGE TDPMYDEAVS VVLKTRKASI SGVQRALRIG YNRAARLIDQ
901 MEAEGIVSAP EHNGNRTILV PLDNA*

This partial gonococcal sequence contains a predicted transmembrane region and a predicted
ATP/GTP-binding site motif A (P-loop; double underlined). Furthermore, it has a domain
homologous to the FtsK cell division protein of E. coli. Alignment of ORF58ng (SEQ ID NO:
494) and FtsK (accession number P46889) (SEQ ID NO: 1142) shows 65% amino acid identity in
a 459 aa overlap:



ORF58ng: 467 IEEKIAEFKVKVKVVDSYSGPVITRYEIEPDVGVRGNSVLNLEKDLARSLGVASIRVVET 526 

+E +LA+F++K W+ GPVITR+E+ GV+ + NL +DLARSL ++RWE 
FtsK: 868 VEARLADFRIKADVVNYSPGPVITRFELNLAPGVKAARISNLSRDLARSLSTVAVRVVEV 927

ORF58ng: 527 IPGKTCMGLELPNPKRQMIRLSEIFNSPEFAESKSKLTLALGQDITGQPWTDLGKAPHL 586
IPGK +GLELPN KRQ + L E+ ++ +F ++ S LT+ LG+DI G+PW DL K PHL
FtsK: 928 IPGKPYVGLELPNKKRQTVYLREVLDNAKFRDNPSPLTWLGKDIAGEPWADLAKMPHL 987

ORF58ng: 587 LVAGTTGSGKSVGVNAMILSMLFKAAPEDVRMIMIDPKMLELSIYEGITHLLAPWTDMK 646
LVAGTTGSGKSVGVNAMILSML+KA PEDVR IMIDPKMLELS+YEGI HLL WTDMK
FtsK: 988 LVAGTTGSGKSVGVNAMILSMLYKAQPEDVRFIMIDPKMLELSVYEGIPHLLTEWTDMK 1047

ORF58ng: 647 LAANALNWCVNEMEKRYRLMSFMGVRNLAGFNQKIAEAAARGEKIGNPFSLTPDDPEP-- 704
AANAL WCVNEME+RY+LMS +GVRNLAG+N+KIAEA I +P+ D +
FtsK: 1048 DAANALRWCVNEMERRYKLMSALGVRNLAGYNEKIAEADRMMRPIPDPYWKPGDSMDAQH 1107

ORF58ng: 705 -LEKLPFIWWDEFADLMMTAGKKIEELIARLAQKARAAGIHLILATQRPSVDVITGL 762
L+K P+IW+VDEFADLMMT GKK+EELIARLAQKARAAGIHL+LATQRPSVDVITGL
FtsK: 1108 PVLKKEPYIWLVDEFADLMMTVGKKVEELIARLAQKARAAGIHLVLATQRPSVDVITGL 1167



ORF58ng : 763 IKANIPTRIAFQVSSKIDSRTILDQMGAENLLGQGDMLFLPPGTAYPQRVHGAFASDEEV 822 

IKANIPTRIAF VSSKIDSRTILDQ GAE+LLG GDML+ P + P RVHGAF D+EV 
FtsK: 1168 IKANIPTRIAFTVSSKIDSRTILDQAGAESLLGMGDMLYSGPNSTLPVRVHGAFVRDQEV 1227 

ORF58ng : 823 HRWEYLKQFGEPDYVDDILSGGGSEELPGIGRSGDGETDPMYDEAVSWLKTRKASISG 882 

H W+ K G P YVD IS SE G G G E DP++D+AV V + RKASISG 
FtsK: 1228 HAWQDWKARGRPQYVDGITSDSESEGGAG-GFDGAEELDPLFDQAVQFVTEKRKASISG 1286 

ORF58ng: 883 VQRALRIGYNRAARLIDQMEAEGIVSAPEHNGNRTILVP 921

VQR RIGYNRAAR+I+QMEA+GIVS HNGNR +L P
FtsK: 1287 VQRQFRIGYNRAARIIEQMEAQGIVSEQGHNGNREVLAP 1325
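
The ATP/GTP-binding site motif A (P-loop) mentioned above is commonly written as the consensus pattern [AG]-x(4)-G-K-[ST]. The following is a minimal sketch that searches a protein fragment for this pattern with a regular expression; the pattern is stated here as an assumption about the motif definition used, and `find_p_loops` is an illustrative name rather than a tool named in the specification.

```python
# Minimal sketch: locate P-loop (motif A) candidates with a regular expression.
import re

# Assumed consensus: [AG]-x(4)-G-K-[ST]
P_LOOP = re.compile(r"[AG].{4}GK[ST]")


def find_p_loops(protein: str):
    """Return (1-based start, matched text) for each non-overlapping match."""
    return [(m.start() + 1, m.group()) for m in P_LOOP.finditer(protein)]


# Fragment of ORF58ng taken from the region aligned with FtsK above.
fragment = "GKAPHLLVAGTTGSGKSVGVNAMILSML"
print(find_p_loops(fragment))  # [(10, 'GTTGSGKS')]
```

The match GTTGSGKS falls in the stretch that ORF58ng shares identically with FtsK in the alignment above.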

Further work on ORF58ng revealed the complete gonococcal DNA sequence to be [<SEQ ID
495>] (SEQ ID NO: 495):



1 ATGTTTTGGA TAGTTTTGAT CGTTATtgtg TTGCTTGCGC TTGCCGGCCT
51 GTTTTTTGTC CGCGCACAAT CCGAACGCGA GTGGATGCGC GAGGTTTCTG
101 CGTGGCAGGA AAAGAAAGGG GAAAAACAGG CGGAGCTGCC TGAAATCAAA
151 GACGGTATGC CCGATTTTCC CGAGTTTTCC CTGATGCTTT TCCATGCCGT
201 CAAAACGGCA GTGTATTGGC TGTTTGTCGG TGTCGTCCGT TTCTGCCGAA
251 ACTATCTGGC GCACGAATCC GAACCGGACA GGCCCGTTCC GCCTGCTTCT
301 GCAAACCGTG CGGATGTTCC GACCGCATCC GACGGGTATT CAGACAGTGG
351 AAACGGGACG GAAGAAGCGG AAACGGAAGC AGCAGAAGCT GCGGAGGAAG
401 AGGCTGCCgA TACgGAAGAC ATTGCAACTG CCGTAATCGA CAACCGCCGC
451 ATCCcatTCG ACCGGAGTAT TGCTGAAGGG TTGATGCAGT CTGAAAGCAA
501 AACTTCGCCC GTCCGTCCGG TTTTTAAGGA AATCACTTTG GAAGAAGCAA
551 CGCGTGCTTT AAGCAGCGCG GCTTTAAGGG AAACGAAAAA ACGCTATATC
601 GATGCATTTG AGAAAAACGG AACAGCCGTC CCCAAAGTAC GCGTGTCCGA
651 TACCCCGATG GAAGGGCTGC AGATTATCGG TTTGGACGAC CCTGTGCTTC
701 AACGCACGTA TTCCCGTATG TTTGATGCGG ACAAAGAAGC GTTTTCCGAG
751 TCTGCGGATT ACGGATTTGA GCCGTATTTT GAGAAGCAGC ATCCGTCTGC
801 CTTTTCTGCA GTCAAAGCCG AAAATGCACG GAATGCGCCG TTCCGCCGTC
851 ATGCAGGGCA GGAGAAAGGG CAGGCGGAGG CAAAATCCCC GGATGTTTCC
901 CAAGGGCAGT CCGTTTCAGA CGGCACAGCC GTCCGCGATG CCCGCCGCCG
951 CGTTTCCGTC AATTTGAAAG AACCGAACAA GGCAACGGTT TCTGCGGAGG
1001 CGCGGATTTC GCGCCTGATT CCGGAAAGTC GGACGGTTGT CGGGAAACGG
1051 GATGTCGAAA TGCCGTCTGA AACCGAAAAT GTTTTCACGG AAACCGTTTC
1101 GTCTGTGGGA TACGGCGGTC CGGTTTATGA TGAAGCTGCC GATATCCATA
1151 TTGAAGAGCC TGCCGCGCCC GATGCTTGGG TGGTCGAACC ACCCGAAGTG
1201 CCGGAGGTAG CCGTACCCGA AATCGATATT CTGCCGCCGC CTCCCGTATC
1251 GGAAATCTAC AACCGTACCT ATGAGCCGCC GGCAGGATTC GAGCAGGCGC
1301 AACGCAGCCG CATTGCCGAA ACCGACCATC TTGCCGCTGA TGTTTTGAAT
1351 GGAGGTTGGC AGGAGGAAAC CGCCGCTATT GCAGATGACG GCAGTGAGGG
1401 TGCGGCAGAG CGGTCAAGCG GGCAATATCT GTCGGAAACC GAAGCGTTCG

1451 GGCATGACAG TCAGGCGGTT TGTCCGTTTG AAGATGTGCC GTCTGAACGC 

1501 CCGTCCTGCC GGGTATCGGA TACGGAAGCG GATGAAGGGG CGTTCCAATC 

1551 GGAAGAGACC GGTGCGGTAT CCGAACACCT GCCGACAACC GACCTGCTTC 

1601 TGCCTCCGCT GTTCAATCCC GAGGCGACGC AAACCGAAGA AGAACTGTTG

1651 GAAAACAGCA TCACCATCGA AGAAAAATTG GCGGAGTTCA AAGTCAAGGT 

1701 CAAGGTTGTC GATTCTTATT CCGGCCCCGT GATTACGCGT TATGAAATCG 

1751 AACCCGATGT CGGCGTGCGC GGCAATTCCG TTCTGAATTT GGAAAAAGAC 

1801 TTGGCGCGTT CGCTCGGCGT GGCTTCCATC CGCGTTGTCG AAACCATCCC 

1851 CGGCAAAACC TGCATGGGTT TGGAACTTCC GAACCCGAAA CGCCAAATGA

1901 TACGCCTGAG CGAAATTTTC AATTCGCCCG AGTTTGCCGA ATGCAAATCC 

1951 AAGCTGACGC TCGCGCTCGG TCAGGACATT ACCGGACAGC CCGTCGTAAC 

2001 CGACTTGGGC AAAGCACCGC ATTTGCTGGT TGCCGGCACG ACCGGTTCGG 

2051 GCAAATCGGT GGGTGTCAAC GCGATGATTC TGTCTATGCT TTTCAAAGCC

2101 GCGCCGGAAG ACGTGCGTAT GATTATGATC GATCCGAAAA TGCTGGAATT

2151 GAGCATTTAC GAAGGCATCA CGCACCTGCT CGCCCCTGTC GTTACCGATA 

2201 TGAAGCTGGC GGCAAACGCG CTGAACTGGT GTGTTAACGA AATGGAAAAA 

2251 CGCTACCGCC TGATGAGCTT TATGGGCGTG CGCAATCTTG CGGGCTTCAA 

2301 CCAAAAAATC GCCGAAGCCG CAGCAAGGGG AGAAAAAATC GGCAATCCGT 

2351 TCAGCCTCAC GCCCGACGAT CCCGAACCTT TGGAAAAACT GCCGTTTATC

2401 GTGGTCGTGG TCGATGAGTT TGCCGATTTG ATGATGACGG CAGGCAAGAA

2451 AATCGAAGAA CTGATTGCGC GCCTCGCCCA AAAAGCCCGC GCGGCAGGCA

2501 TCCACCTTAT CCTTGCCACA CAACGCCCCA GCGTCGATGT CATCACGGGT 

2551 CTGATTAAGG CGAACATCCC GACGCGTATC GCGTTCCAAG TGTCCAGCAA

2601 AATCGACAGC CGCACGATTC TCGACCAAAT GGGCGCGGAA AACCTGCTCG

2651 GTCAGGGCGA TATGCTGTTC CTGCCGCCGG GTACTGCCTA TCCGCAGCGC 

2701 GTTCACGGCG CGTTTGCCTC GGATGAAGAG GTGCACCGCG TGGTCGAATA

2751 TCTGAAGCAG TTTGGCGAGC CGGACTATGT TGACGATATT TTGAGCGGCG

2801 GCGGCAGCGA AGAGCTGCCC GGCATCGGGC GCAGCGGCGA CGGCGAAACC

2851 GATCCGATGT ACGACGAGGC CGTATCCGTT GTCCTGAAAA CGCGCAAAGC

2901 CAGCATTTCG GGCGTACAGC GCGCCTTGCG CATCGGCTAC AACCGCGCCG

2951 CGCGTCTGAT TGACCAAATG GAAGCGGAAG GCATTGTGTC CGCACCGGAA

3001 CACAACGGCA ACCGTACGAT TCTCGTCCCC TTGGACAATG CTTGA 

This corresponds to the amino acid sequence [<SEQ ID 496; ORF58ng-1>] (SEQ ID NO: 496;
ORF58ng-1):



1 MFWIVLIVIV LLALAGLFFV RAQSEREWMR EVSAWQEKKG EKQAELPEIK
51 DGMPDFPEFS LMLFHAVKTA VYWLFVGVVR FCRNYLAHES EPDRPVPPAS
101 ANRADVPTAS DGYSDSGNGT EEAETEAAEA AEEEAADTED IATAVIDNRR
151 IPFDRSIAEG LMQSESKTSP VRPVFKEITL EEATRALSSA ALRETKKRYI
201 DAFEKNGTAV PKVRVSDTPM EGLQIIGLDD PVLQRTYSRM FDADKEAFSE
251 SADYGFEPYF EKQHPSAFSA VKAENARNAP FRRHAGQEKG QAEAKSPDVS
301 QGQSVSDGTA VRDARRRVSV NLKEPNKATV SAEARISRLI PESRTVVGKR
351 DVEMPSETEN VFTETVSSVG YGGPVYDEAA DIHIEEPAAP DAWVVEPPEV
401 PEVAVPEIDI LPPPPVSEIY NRTYEPPAGF EQAQRSRIAE TDHLAADVLN
451 GGWQEETAAI ADDGSEGAAE RSSGQYLSET EAFGHDSQAV CPFEDVPSER
501 PSCRVSDTEA DEGAFQSEET GAVSEHLPTT DLLLPPLFNP EATQTEEELL
551 ENSITIEEKL AEFKVKVKVV DSYSGPVITR YEIEPDVGVR GNSVLNLEKD
601 LARSLGVASI RVVETIPGKT CMGLELPNPK RQMIRLSEIF NSPEFAESKS
651 KLTLALGQDI TGQPVVTDLG KAPHLLVAGT TGSGKSVGVN AMILSMLFKA
701 APEDVRMIMI DPKMLELSIY EGITHLLAPV VTDMKLAANA LNWCVNEMEK
751 RYRLMSFMGV RNLAGFNQKI AEAAARGEKI GNPFSLTPDD PEPLEKLPFI
801 VVVVDEFADL MMTAGKKIEE LIARLAQKAR AAGIHLILAT QRPSVDVITG
851 LIKANIPTRI AFQVSSKIDS RTILDQMGAE NLLGQGDMLF LPPGTAYPQR
901 VHGAFASDEE VHRVVEYLKQ FGEPDYVDDI LSGGGSEELP GIGRSGDGET
951 DPMYDEAVSV VLKTRKASIS GVQRALRIGY NRAARLIDQM EAEGIVSAPE
1001 HNGNRTILVP LDNA*

ORF58ng-1 (SEQ ID NO: 496) and ORF58-1 (SEQ ID NO: 490) show 97.2% identity in 1014 aa
overlap:

10 20 30 40 50 60 

orf 58-1 .pep MFWIVLIVILLLALAGLFFVRAQSEREWMREVSAWQEKKGEKQAELPEIKDGMPDFPELA 

5 I : I i 1 1 1 1 1 1 1 1 1 1 1 L 1 1 1 1 1 1 i 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 ^ 

orf 58ng-l MFWIVLIVIVLLALAGLFFVRAQSEREWMREVSAWQEKKGEKQAELPEIKDGMPDFPEFS 

10 20 30 40 50 60 

70 80 90 100 110 120 

orf 58- 1 . pep LMLFHAVKTAVYWLFVGWRFCRNYLAHESEPDRPVPPASANRADVPTASDGYSDSGNGT 

10 1 1 1 1 1 1 1 1 1 1 ■ 1 1 1 1 1 1 1 1 1 1 1 1 1 1 M I 1 1 1 II II 1 1 1 1 M 1 1 1 1 1 1 1 1 1 1 M 1 1 1 1 ; 

orf 58ng-l LMLFHAVKTAVYWLFVGWRFCRNYLAHESEPDRPVPPASANRADVPTASDGYSDSGNGT 

70 80 90 100 110 120 

130 140 150 160 170 180 

orf 58-1 .pep EEAETEEAEAAEEEAADTED I ATAVI DNRRI PFDRS I AEGLMPS ESE I S PVRP VFKE I TL 

15 MINI II 1 1 1 1 1 M 1 1 1 M 1 1 1 1 1 1 1 1 1 1 II 1 1 U 1 1 llh llllllllllll 

orf 58ng-l EEAETEAAEAAEEEAADTEDIATAVIDNRRIPFDRSIAEGLMQSESKTSPVRPVFKEITL 

130 140 150 160 170 180 

190 200 210 220 230 240 

orf 58-1 .pep EEATRALNSAALRETKKRYIDAFEKNETAVPKVRVSDTPMEGLQIIGLDDPVLQRTYSHM 

20 | | | | | || : | | | | | | | | | | | | | | | | | | | | | | | | | | | | I I II M I I I I I I I I I I II I I h I 

orf 58ng-l EEATRALSSAALRETKKRYIDAFEKNGTAVPKVRVSDTPMEGLQIIGLDDPVLQRTYSRM 

190 200 210 220 230 240 

250 260 270 280 290 300 

orf 58-1 .pep FDADKEAFSESADYGFEPYFEKQHPSAFSAVKAENARNAPFHRHAGQGKGQAEAKSPDVS 

25 | || | | | | | | | | || | | | | | | | | | | | | M | | | | | | | | | | | | | | : I I I I I I I I I I I I I I I I I 

orf 58ng-l FDADKEAFSESADYGFEPYFEKQHPSAFSAVKAENARNAPFRRHAGQEKGQAEAKSPDVS 
250 260 270 280 290 300 

310 320 330 340 350 360 

orf 58 - 1 . pep QGQSVSDGTAVRDARRRVSVNLKEPNKATVSAEARISRLIPESQTWGKRDVEMPSETEN 

30 | | | | | | | | | | | | | | | | | | | || | | | | | | | | | | | | | | | | | | | | I I : I I I I I I I I I I I I I I I I 

orf 58ng-l QGQSVSDGTAVRDARRRVSVNLKEPNKATVSAEARISRLIPESRTWGKRDVEMPSETEN 
310 320 330 340 350 360 

370 380 390 400 410 420 

orf 58-1 .pep VFTETVSSVGYGGPVYDETADIHIEEPAAPDAWWEPPEVPKVPMTAIDIQPPPPVSEIY 

35 | Ml MM Ml MINIM! I III III I II II II III II hi = III 1 1 III INI 

orf 58ng-l VFTETVSSVGYGGPVYDEAADIHIEEPAAPDAWWEPPEVPEVAVPEIDILPPPPVSEIY 

370 380 390 400 410 420 

430 440 450 460 470 480 

orf58-l .pep NRTYEPPSGFEQVQRSRIAETDHLADDVLNGGWQEETAAIADDGSEGAAERSSGQYLSET 

40 | | | | | | | : | || | : | | | | | | | | | | | | | | | | | | | | | | | | I I | I I I I I II I I I II I I I I I I I 

orf58ng-l NRTYEPPAGFEQAQRSRIAETDHLAADVLNGGWQEETAAIADDGSEGAAERSSGQYLSET 
430 440 450 460 470 480 

490 500 510 520 530 540 

orf58-l .pep EAFGHDSQAVCPFENVPSERPSCRVSDTEADEGAFPSEETGAVSEHLPTTDLLLPPLFNP 

45 || | | | | | | | | | | | | : | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | I | | | I I I I I I I I 

orf 58ng-l EAFGHDSQAVCPFEDVPSERPSCRVSDTEADEGAFQSEETGAVSEHLPTTDLLLPPLFNP 
490 500 510 520 530 540

550 560 570 580 590 600

orf 58-1 .pep EATQTEEELLENSITIEEKLAEFKVKVKWDSYSGPVITRYEIEPDVGVRGNSVLNLEKD 

IMIMIII MM IIIIIIIMIII IMMIIIIIIMMIIIII IIIMIIMIIIII 

orf58ng-l EATQTEEELLENS I T I EEKLAEFKVKVKVVDS YSGPVI TRYE I EPDVGVRGNS VLNLEKD 

550 560 570 580 590 600 



610 620 630 640 650 660 

orf58-l .pep LARSLGVAS I RWET I PGKTCMGLELPNPKRQM I RLSE I FNS PEFAES KS KLTLALGQD I 

I I I I I I I I I I I I I I I I M I I II I I I I I ! I I I II I I II I I I I I I II I I I I I I I I I I I 
orf 58ng-l LARSLGVAS 1 RWET I PGKTCMGLELPNPKRQM I RLSE I FNS PEFAES KS KLTLALGQD I 
610 620 630 640 650 660






670 680 690 700 710 720 

orf 58-1 .pep TGQP WTDLGKAPHLLVAGTTGSGKS VGVNAM I LSMLFKAAPED VRM IM I DPKMLELS I Y 

1 1 1 1 1 1 1 1 1 1 II 1 1 1 1 II 1 1 1 M 1 1 1 1 1 1 1 1 1 1! M 1 1 1 i 1 1 1 1 1 1 1 1 1 1 M 1 1 1 1 1 ! 

orf58ng-l TGQPWTDLGKAPHLLVAGTTGSGKSVGVNAMILSMLFKAAPEDVRMIMIDPKMLELSIY 

670 680 690 700 710 720 






730 740 750 760 770 780 

orf 58-1. pep EGI PHLI^PWTDMKLAANALNWCVNEMEKRYRLMSFMGVRNLAGFNQKIAEAAARGEKI 

Ml I I I I I I I I I I I I I I I I I I M I I I I I I M I I I I I I I I I I I I I I I I I I ■ I I I I I I 
orf 58ng-l EGITHLI^PWTDMKLAANALNWCVNEMEKRYRLMSFMGVRNLAGFNQKIAEAAARGEKI 

730 740 750 760 770 780 






790 800 810 820 830 840 

orf 58- 1 . pep GNPFSLTPDDPEPLEKLPFIWWDEFADLMMTAGKKIEELIARLAQKARAAGIHLILAT 

' IMIMIII I II 1 1 IIIIIIIMIII Mill III II MM MM I II II 1 1 Mil I MM 

orf 58ng-l GNPFSLTPDDPEPLEKLPFIVVVVDEFADLMMTAGKKIEELIARLAQKARAAGIHLILAT 

790 800 810 820 830 840 






850 860 870 880 890 900 

orf 58-1 .pep QRPSVDVITGLI KANI PTRIAFQVSSKIDSRTILDQMGAENLLGQGDMLFLLPGTAYPQR 

II I II I II II I I M I I I I II I I II I I I I I II I I I I I II I I I I II I I I II II Mill 
orf58ng-l QRPSVDVITGLIKANIPTRIAFQVSSKIDSRTILDQMGAENLLGQGDMLFLPPGTAYPQR 

850 860 870 880 890 900 






910 920 930 940 950 960 

orf 58 - 1 . pep VHGAFASDEEVHRWEYLKQFGEPDYVDDILSGGGSEELPGIGRSGDDETDPMYDEAVSV 

II II II Nil II II ill II II II II II MM' II II II II II I Mill MM II 

orf 58ng-l VHGAFASDEEVHRWEYLKQFGEPDYVDDILSGGGSEELPGIGRSGDGETDPMYDEAVSV 

910 920 930 940 950 960 



970 980 990 1000 1010 

or f 5 8 - 1 . pep VLKTRKAS I SGVQRALR I GYNRAARL I DQMEAEG I VS APEHNGNRT I LVPLDNAX 

I I I I I I I I I II II I II I I II I II I i I I I I I I I II II II II I M II I I I I I I I I II I 
orf 58ng-l VLKTRKAS I SGVQRALR I GYNRAARL I DQMEAEG I VS APEHNGNRT I LVPLDNAX 

970 980 990 1000 1010

Furthermore, ORF58ng-1 (SEQ ID NO: 496) shows significant homology to the E. coli protein
FtsK (SEQ ID NO: 1142):



sp|P46889 |FTSK_ECOLI CELL DIVISION PROTEIN FTSK ) gi | 16514 12 | gnl | PID | dl015290 (Dl 
division protein FtsK [Escherichia coli] ) gi | 1651418 | gnl | PID | dl015296 (D90727) Cell

division protein FtsK [Escherichia coli] )gi| 1787117 (AE000191) cell division 
protein FtsK [Escherichia coli] Length = 1329 
Score = 576 bits (1469), Expect = e-163 

Identities = 301/459 (65%), Positives = 353/459 (76%), Gaps = 5/459 (1%)

Query: 556 IEEKIAEFKVKVKVVDSYSGPVITRYEIEPDVGVRGNSVLNLEKDLARSLGVASIRVVET 615

+E +LA+F++K W+ GPVITR+E+ GV+ + NL +DLARSL ++RWE 
Sbjct: 868 VEARLADFRIKADVVNYSPGPVITRFELNLAPGVKAARISNLSRDLARSLSTVAVRVVEV 927 

Query : 616 IPGKTCMGLELPNPKRQMIRLSEIFNSPEFAESKSKLTLALGQDITGQPWTDLGKAPHL 675 
IPGK +GLELPN KRQ + L E+ ++ +F ++ S LT+ LG+DI G+PW DL K PHL

Sbjct: 928 IPGKPYVGLELPNKKRQTVYLREVLDNAKFRDNPSPLTVVLGKDIAGEPVVADLAKMPHL 987 

Query: 676 LVAGTTGSGKSVGVNAMILSMLFKAAPEDVRMIMIDPKMLELSIYEGITHLLAPWTDMK 735 

LVAGTTGSGKS VGVNAM I LSML+ KA PEDVR IMIDPKMLELS+YEGI HLL WTDMK 
Sbjct : 988 LVAGTTGSGKSVGVNAMILSMLYKAQPEDVRFIMIDPKMLELSVYEGIPHLLTEWTDMK 1047 

Query: 736 LAANALNWCVNEMEKRYRLMSFMGVRNLAGFNQKIAEAAARGEKIGNPFSLTPDDPEP-- 793

AANAL WCVNEME+RY+LMS +GVRNLAG+N+KIAEA I +P+ D + 

Sbjct: 1048 DAANALRWCVNEMERRYKLMSALGVRNLAGYNEKIAEADRMMRPIPDPYWKPGDSMDAQH 1107 

Query: 794 --LEKLPFIVVVVDEFADLMMTAGKKIEELIARLAQKARAAGIHLILATQRPSVDVITGL 851
L+K P+IW+VDEFADLMMT GKK+EELIARLAQKARAAGIHL+LATQRPSVDVITGL
Sbjct: 1108 PVLKKEPYIWLVDEFADLMMTVGKKVEELIARLAQKARAAGIHLVLATQRPSVDVITGL 1167

Query : 852 IKANIPTRIAFQVSSKIDSRTILDQMGAENLLGQGDMLFLPPGTAYPQRVHGAFASDEEV 911 

IKANIPTRIAF VSSKIDSRTILDQ GAE+LLG GDML+ P + P RVHGAF D+EV 
Sbjct: 1168 IKANIPTRIAFTVSSKIDSRTILDQAGAESLLGMGDMLYSGPNSTLPVRVHGAFVRDQEV 1227 

Query : 912 HRWEYLKQFGEPDYVDDILSGGGSEELPGIGRSGDGETDPMYDEAVSWLKTRKASISG 971 
H W+ K G P YVD IS SE G G G E DP++D+AV V + RKASISG

Sbjct: 1228 HAWQDWKARGRPQYVDGITSDSESEGGAG-GFDGAEELDPLFDQAVQFVTEKRKASISG 1286 

Query: 972 VQRALRIGYNRAARLIDQMEAEGIVSAPEHNGNRTILVP 1010

VQR RIGYNRAAR+I+QMEA+GIVS HNGNR +L P
Sbjct: 1287 VQRQFRIGYNRAARIIEQMEAQGIVSEQGHNGNREVLAP 1325
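
The percentages in the summary line of the report above follow from the printed counts and the alignment length of 459. The following is a minimal sketch of that arithmetic; it assumes whole-number percentages obtained by integer division, which reproduces the three figures printed in this report, and `blast_percent` is an illustrative name only.

```python
# Minimal sketch: recompute the summary percentages from the printed counts.
def blast_percent(count: int, length: int) -> int:
    """Whole-number percentage of count over length (rounded down)."""
    return (100 * count) // length


ALIGNMENT_LENGTH = 459
for label, count in (("Identities", 301), ("Positives", 353), ("Gaps", 5)):
    pct = blast_percent(count, ALIGNMENT_LENGTH)
    print(f"{label} = {count}/{ALIGNMENT_LENGTH} ({pct}%)")
```

For example, 301/459 is about 65.6%, which the report shows as 65%.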

Based on this analysis, it is predicted that the proteins from N. meningitidis and N. gonorrhoeae, and
their epitopes, could be useful antigens for vaccines or diagnostics, or for raising antibodies.

Example 59 



The following partial DNA sequence was identified in N. meningitidis [<SEQ ID 497>] (SEQ ID 
NO: 497): 

1 ATGATTTATC AAAGAAACCT CATCAAAGAA CTCTCTTTTA CCGCCGTCGG 

51 CATTTTCGTC GTCCTCTTGG CGGTATTGGT CTCCACGCAG GCAATCAACC 
101 TGCTCGGCCG TGCCGCCGAC GGGC . . GTGA TCGCCATCGA TGCCGTGTTG 

151 GCATTGGTCG GCTTCTGGGT C 

// 

901 A TTGCCATCGG TTTGTTTTTA ATTTACCAAA ACGGGCTGAC 

951 CCTGCTTTTT GAAGCCGTGG AAGACGGCAA AATCCATTTT TGGCTCGGAC 
1001 TGCTGCCTAT GCACATTATC ATGTTTGTCC TTGCACTCAT CCTGTTGCGC 
1051 GTCCGCAGTA TGCCCAGCCA GCCCTTCTGG CAGGCGGTTG GCAAAAGTCT 
1101 GACATTGAAA GGCGGAAAAT GA 



This corresponds to the amino acid sequence [<SEQ ID 498; ORF101>] (SEQ ID NO: 498; 
ORF101): 

1 MIYQRNLIKE LSFTAVGIFV VLLAVLVSTQ AINLLGRAAD GXVIAIDAVL 

51 ALVGFWV 

// 

301 . . . IAIGLFL IYQNGLTLLF EAVEDGKIHF WLGLLPMHII MFVLALILLR 
351 VRSMPSQPFW QAVGKSLTLK GGK* 
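
The correspondence between a nucleotide sequence and its amino acid sequence can be illustrated with a short Python sketch. It is not taken from the specification: the function name `translate` is hypothetical, and the codon assignments of the standard genetic code are assumed to apply. Codons containing unresolved positions (for example the ". ." or "N" characters in the listings) are rendered as X.

```python
from itertools import product

# Standard genetic code, codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(codon): aa
               for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)}

def translate(dna):
    """Translate a DNA string in frame 1; whitespace is ignored, unknown codons give X."""
    dna = "".join(dna.split()).upper()
    protein = []
    for i in range(0, len(dna) - 2, 3):
        protein.append(CODON_TABLE.get(dna[i:i + 3], "X"))
    return "".join(protein)

# First 50 nucleotides of the partial sequence shown above (SEQ ID NO: 497).
first_line = "ATGATTTATC AAAGAAACCT CATCAAAGAA CTCTCTTTTA CCGCCGTCGG"
print(translate(first_line))
```

The 16 complete codons of this first line translate to the opening residues of the listed amino acid sequence, MIYQRNLIKE LSFTAV.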

Further work revealed the complete nucleotide sequence [<SEQ ID 499>] (SEQ ID NO: 499): 



1 ATGATTTATC AAAGAAACCT CATCAAAGAA CTCTCTTTTA CCGCCGTCGG 

51 CATTTTCGTC GTCCTCTTGG CGGTATTGGT CTCCACGCAG GCAATCAACC 

101 TGCTCGGCCG TGCCGCCGAC GGGCGTGTCG CCATCGATGC CGTGTTGGCA 

151 TTGGTCGGCT TCTGGGTCAT CGGTATGACG CCGCTTTTGC TGGTGTTGAC 

201 CGCATTTATC AGTACGTTGA CCGTGTTGAC CCGCTACTGG CGCGACAGCG 

251 AAATGTCGGT CTGGCTATCC TGCGGATTGG CATTGAAACA ATGGATACGC 

301 CCGGTGATGC AGTTTGCCGT GCCGTTTGCC GTTTTGGTTG CCGTCATGCA 

351 GCTTTGGGTG ATACCGTGGG CAGAGCTACG CAGCCGCGAA TACGCTGAAA 

401 TCCTGAAGCA GAAGCAGGAA TTGTCTTTGG TGGAGGCAGG CGAGTTCAAC 
451 AGTTTGGGCA AGCGCAACGG CAGGGTTTAT TTTGTCGAAA CCTTCGATAC 
501 CGAATCCGGC ATCATGAAAA ACCTGTTCCT GCGCGAACAG GACAAAAACG 
551 GCGGCGACAA CATCATCTTC GCCAAAGAAG GTAACTTCTC GCTGAACGAC 
601 AACAAACGCA CGCTCGAATT GCGCCACGGC TACCGTTACA GCGGCACGCC 
651 CGGACGCGCC GACTACAATC AGGTTTCCTT CCAAAAACTC AACCTGATTA 
701 TCAGCACCAC GCCCAAACTC ATCGACCCCG TTTCCCACCG CCGTACCATT 
751 CCGACCGCCC AACTGATTGG CAGCAGCAAC CCGCAACATC AGGCGGAATT 
801 GATGTGGCGC ATCTCGCTGA CCGTCAGCGT CCTCCTACTC TGCCTGCTTG 
851 CCGTGCCGCT TTCCTATTTC AACCCGCGCA GCGGACATAC CTACAATATC 
901 TTGATTGCCA TCGGTTTGTT TTTAATTTAC CAAAACGGGC TGACCCTGCT 
951 TTTTGAAGCC GTGGAAGACG GCAAAATCCA TTTTTGGCTC GGACTGCTGC 

1001 CTATGCACAT TATCATGTTT GCCGTTGCAC TCATCCTGTT GCGCGTCCGC 

1051 AGTATGCCCA GCCAGCCCTT CTGGCAGGCG GTTGGCAAAA GTCTGACATT 

1101 GAAAGGCGGA AAATGA 

This corresponds to the amino acid sequence [<SEQ ID 500; ORF101-1>] (SEQ ID NO: 500; 
ORF101-1): 



1 MIYQRNLIKE LSFTAVGIFV VLLAVLVSTQ AINLLGRAAD GRVAIDAVLA 

51 LVGFWVIGMT PLLLVLTAFI STLTVLTRYW RDSEMSVWLS CGLALKQWIR 

101 PVMQFAVPFA VLVAVMQLWV IPWAELRSRE YAEILKQKQE LSLVEAGEFN 

151 SLGKRNGRVY FVETFDTESG IMKNLFLREQ DKNGGDNIIF AKEGNFSLND 

201 NKRTLELRHG YRYSGTPGRA DYNQVSFQKL NLIISTTPKL IDPVSHRRTI 

251 PTAQLIGSSN PQHQAELMWR ISLTVSVLLL CLLAVPLSYF NPRSGHTYNI 

301 LIAIGLFLIY QNGLTLLFEA VEDGKIHFWL GLLPMHIIMF AVALILLRVR 

351 SMPSQPFWQA VGKSLTLKGG K* 

Computer analysis of this amino acid sequence gave the following results: 



Homology with a predicted ORF from N. meningitidis (strain A) 

ORF101 (SEQ ID NO: 498) shows 91.2% identity over a 57aa overlap and 95.7% identity over a 
69aa overlap with an ORF (ORF101a) (SEQ ID NO: 502) from strain A of N. meningitidis: 

10 20 30 40 50 

orf 101 .pep MIYQRNLIKELSFTAVGIFWLLAVLVSTQAINLLGRAADGXVIAIDAVLALVGFWVX 

MIIIIIIIIIMMMIIIIIIIMI III IIMM III IIMIIIIIIIIII 

orf 101a MI YQRNLI KELS FTAVGI FWLLAVLVSTQAINLLGXAADXRX - AIDAVLALVGFWVXXM 

10 20 30 40 • 50 

// 

90 100 110 

orf 101 .pep . : IAIGLFLIYQNGLTLLFEAVEDGKIHFWLGL 

I Mill I MM Mill II II II II MM II 

orf 1 Ola LTVSVLLLCLLAVPLSYFNPRSGHTYNILXAIGLFLIYQNGLTLLFEAVEDGKIHFWLGL 
280 290 300 310 320 330 



120 130 140 150 

orf 101 .pep LPMHI IMFVLALILLRVRSMPSQPFWQAVGKSLTLKGGKX 

: I : ^ 1 1 1 i 1 1 1 1 M 1 1 M 1 1 : 1 I I I M M I I 

orf 101a LPMHI IMFVIAIVLLRVRSMPSQPFWQAVGKSLTLKGGKX 

340 350 360 370 



The complete length ORF101a nucleotide sequence [<SEQ ID 501>] (SEQ ID NO: 501) is: 

1 ATGATTTATC AAAGAAACCT CATCAAAGAA CTCTCTTTTA CCGCCGTCGG 
51 CATTTTCGTC GTCCTCTTGG CGGTATTGGT CTCCACGCAG GCAATCAACC 
101 TGCTCGGCCN TGCCGCCGAC NGGCGTNTCG CCATCGATGC CGTGTTGGCA 
151 TTGGTCGGCT TCTGGGTCNN NNGNATGACG CCGCTTTTGC TNGTGTTGAC 
201 CGCATTTATC AGTACGTTGA CCGTGTTGAC CCGCTACTGG CGNGACAGCG 
251 AAATGTCGGT CTGGNTATCC TGCGGATTGG CATTGAAACA ATGGATACGC 
301 CCGGTGATGC AGTTTGCCGT GCCGTTTGCC GTTTTGGTTG CCGTCATGCA 
351 GCTTTGGGTG ATACCGTGGG CAGAGCTACG CAGCCGCGAA TACGCTGAAA 
401 TCCTGAAGCA GAAGCAGGAA TTGTCTTTGG TGGAGGCAGG CGGGTTCAAC 
451 AGTTTGGGCA AGCGCAACGG CAGGGTTTAT TTTGTCGAAA CCTTCGATAC 
501 CGAATCCGGC ATCATGAAAA ACCTGTTCCT GCGCGAACAG GACAAAAACG 
551 GCGGCGACAA CATCATCTTC NCCAAAGAAA GTAACTTCTC GCTGAACGAC 
601 AACAAACGCA CGCTCGAATT GCGCCACGGC TACCGTTACA GCGGCACGCC 
651 CGGACGCGCC GACTACAATC AGGTTTCCTT CCNAAAACTC AACCTGATTA 
701 TCAGCACCAC GCCCAAACTC ATCGACCCCG TTTCCCACCG CCGTACNATN 
751 CCNACNGCCC AACTGATTGG CAGCAGCAAC CCGCAACATC ANGCGGAATT 
801 GATGTGGCGC ATCTCGCTGA CCGTCAGCGT CCTCCTACTC TGCCTGCTTG 
851 CCGTGCCGCT TTCCTATTTC AACCCGCGCA GCGGACATAC CTACAATATC 
901 TTGANTGCCA TCGGTTTGTT TTTAATTTAC CAAAACGGGC TGACCCTGCT 
951 TTTTGAAGCC GTGGAAGACG GCAAAATCCA TTTTTGGCTC GGACTGCTGC 
1001 CTATGCACAT CATCATGTTC GTCATCGCAA TCGTACTTCT GCGCGTCCGC 
1051 AGCATGCCCA GCCAGCCCTT CTGGCAGGCG GTTGGCAAAA GTCTGACATT 
1101 GAAAGGCGGA AAATGA 

This encodes a protein having amino acid sequence [<SEQ ID 502>] (SEQ ID NO: 502): 



1 MIYQRNLIKE LSFTAVGIFV VLLAVLVSTQ AINLLGXAAD XRXAIDAVLA 

51 LVGFWVXXMT PLLLVLTAFI STLTVLTRYW RDSEMSVWXS CGLALKQWIR 

101 PVMQFAVPFA VLVAVMQLWV IPWAELRSRE YAEILKQKQE LSLVEAGGFN 

151 SLGKRNGRVY FVETFDTESG IMKNLFLREQ DKNGGDNIIF XKESNFSLND 

201 NKRTLELRHG YRYSGTPGRA DYNQVSFXKL NLIISTTPKL IDPVSHRRTX 

251 PTAQLIGSSN PQHXAELMWR ISLTVSVLLL CLLAVPLSYF NPRSGHTYNI 

301 LXAIGLFLIY QNGLTLLFEA VEDGKIHFWL GLLPMHIIMF VIAIVLLRVR 

351 SMPSQPFWQA VGKSLTLKGG K* 
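
The sequence listings in this document follow a fixed layout: 50 residues per line, in blocks of 10, preceded by the position of the first residue on the line. A minimal Python sketch of that layout is given below for illustration. The function name and parameters are hypothetical, and the layout is inferred from the listings themselves rather than from any stated rule.

```python
def format_listing(sequence, per_line=50, block=10):
    """Lay out a sequence as numbered lines of space-separated blocks."""
    seq = "".join(sequence.split())  # drop any whitespace left over from extraction
    lines = []
    for start in range(0, len(seq), per_line):
        chunk = seq[start:start + per_line]
        blocks = [chunk[i:i + block] for i in range(0, len(chunk), block)]
        lines.append(f"{start + 1} " + " ".join(blocks))
    return lines

# First 60 residues of ORF101a (SEQ ID NO: 502), written without spacing.
for line in format_listing("MIYQRNLIKELSFTAVGIFVVLLAVLVSTQAINLLGXAADXRXAIDAVLALVGFWVXXMT"):
    print(line)
```

Re-blocking a listing this way makes stray spaces or missing residues easy to spot, since every block other than the last should contain exactly 10 characters.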






ORF101a (SEQ ID NO: 502) and ORF101-1 (SEQ ID NO: 500) show 95.4% identity in 371 aa 
overlap: 

orf lOla.pep MIYQRNLIKELSFTAVGIFWLLAVLVSTQAINLLGXAADXRXAIDAVLALVGFWVXXMT 60 

I I I I I I I I I I I I I I I I I II II I I I I I I Ml II II I Ml I IIMIIIIIIIII II 
5 orf 101-1 MIYQRNLIKELSFTAVGIFWLLAVLVSTQAINLLGRAADGRVAIDAVLALVGFWVIGMT 60 

orf 101a . pep PLLLVLTAFISTLTVLTRYWRDSEMSVWXSCGLALKQWIRPVMQFAVPFAVLVAVMQLWV 120 

I M 1 1 1 1 1 1 1 1 1 1 1 1 1 1 M 1 1 1 1 Ml IMIIMII MIMIMM MIMIMI 

orf 10 1 - 1 PLLLVLTAF I STLTVLTRYWRDS EMS VWLS CGLALKQW I RPVMQFAVP FAVLVAVMQLWV 120 

orf 101a .pep IPWAELRSREYAEILKQKQELSLVEAGGFNSLGKRNGRVYFVETFDTESGIMKNLFLREQ 180 

10 M II 1 1 1 1 1 1 1 1 1 Ml 1 1 1 1 1 1 1 II I IIMMIMI MIIIIIIMMIMIII I 

orf 101-1 IPWAELRSREYAEILKQKQELSLVEAGEFNSLGKRNGRVYFVETFDTESGIMKNLFLREQ 180 

orf lOla.pep DKNGGDNI I FXKESNFSLNDNKRTLELRHGYRYSGTPGRADYNQVSFXKLNLI ISTTPKL 240 

IIIIIMIII I I : I I I I I I I I I I I I I I I I I I I I I i I I I I ! I I I I I I IMIIMII I 
orf 101-1 DKNGGDNI I FAKEGNFSLNDNKRTLELRHGYRYSGTPGRADYNQVSFQKLNLI ISTTPKL 240 

15 orf 101a .pep IDPVSHRRTXPTAQLIGSSNPQHXAELMWRISLTVSVLLLCLLAVPLSYFNPRSGHTYNI 300 

IMIIMII IMMIII III II 1 1 1 II 1 1 1 IM II 1 1 1 1 1 1 1 1 1 M 1 1 1 1 1 1 

orf 101-1 IDPVSHRRT I PTAQL IGSSNPQHQAELMWRI SLTVS VLLLCLLAVPLS YFNPRSGHTYNI 300 

orf 101a .pep LXAIGLFLIYQNGLTLLFEAVEDGKIHFWLGLLPMHI IMFVIAIVLLRVRSMPSQPFWQA 360 

I 1 1 1 1 1 1 1 1 1 M II M I M II 1 1 1 1 1 1 1 II 1 1 Ml 1 1 -I -1 1 1 1 1 M I M 1 1 1 

20 orf 101-1 LIAIGLFLIYQNGLTLLFEAVEDGKIHFWLGLLPMHI IMFAVALILLRVRSMPSQPFWQA 360 

orf 101a. pep VGKS LTLKGGK 3 71 
IMMIII 

orf 101-1 VGKS LTLKGGK 3 71 
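
The identity figure for an ungapped alignment such as the one above can be recomputed directly from the two listed sequences. The Python sketch below is illustrative and not part of the original analysis. The function name is hypothetical, and unresolved residues (X) are assumed to count as mismatches, an assumption that reproduces the stated 95.4%.

```python
def percent_identity(a, b):
    """Percentage of identical positions between two equal-length, ungapped sequences."""
    a = "".join(a.split())
    b = "".join(b.split())
    if len(a) != len(b):
        raise ValueError("sequences must have the same length")
    matches = sum(x == y for x, y in zip(a, b))
    return 100.0 * matches / len(a)

# ORF101-1 (SEQ ID NO: 500), stop symbol omitted.
ORF101_1 = """
MIYQRNLIKE LSFTAVGIFV VLLAVLVSTQ AINLLGRAAD GRVAIDAVLA
LVGFWVIGMT PLLLVLTAFI STLTVLTRYW RDSEMSVWLS CGLALKQWIR
PVMQFAVPFA VLVAVMQLWV IPWAELRSRE YAEILKQKQE LSLVEAGEFN
SLGKRNGRVY FVETFDTESG IMKNLFLREQ DKNGGDNIIF AKEGNFSLND
NKRTLELRHG YRYSGTPGRA DYNQVSFQKL NLIISTTPKL IDPVSHRRTI
PTAQLIGSSN PQHQAELMWR ISLTVSVLLL CLLAVPLSYF NPRSGHTYNI
LIAIGLFLIY QNGLTLLFEA VEDGKIHFWL GLLPMHIIMF AVALILLRVR
SMPSQPFWQA VGKSLTLKGG K
"""

# ORF101a (SEQ ID NO: 502), stop symbol omitted.
ORF101A = """
MIYQRNLIKE LSFTAVGIFV VLLAVLVSTQ AINLLGXAAD XRXAIDAVLA
LVGFWVXXMT PLLLVLTAFI STLTVLTRYW RDSEMSVWXS CGLALKQWIR
PVMQFAVPFA VLVAVMQLWV IPWAELRSRE YAEILKQKQE LSLVEAGGFN
SLGKRNGRVY FVETFDTESG IMKNLFLREQ DKNGGDNIIF XKESNFSLND
NKRTLELRHG YRYSGTPGRA DYNQVSFXKL NLIISTTPKL IDPVSHRRTX
PTAQLIGSSN PQHXAELMWR ISLTVSVLLL CLLAVPLSYF NPRSGHTYNI
LXAIGLFLIY QNGLTLLFEA VEDGKIHFWL GLLPMHIIMF VIAIVLLRVR
SMPSQPFWQA VGKSLTLKGG K
"""

print(round(percent_identity(ORF101A, ORF101_1), 1))
```

The two sequences differ at 17 of 371 positions, so 354 of 371 positions are identical.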

Homology with a predicted ORF from N. gonorrhoeae 

ORF101 (SEQ ID NO: 498) shows 96.5% identity in 57aa overlap at the N-terminal domain and 
95.1% identity in 61aa overlap at the C-terminal domain, respectively, with a predicted ORF 
(ORF101ng) (SEQ ID NO: 504) from N. gonorrhoeae: 

orf 101 .pep M I YQRNL I KELS FTAVG I FWLLAVLVSTQAINLLGRAADGXVI A I DAVLALVGFWV 57 

I I II I II I II II I II I II II M II II I I II I II II I II I II I I I II I I I I I II II 
30 orf lOlng M I YQRNL I KELS FTAVG I F WLLA VL VS TQ A I NLLGRAADGR V - A I DAVLALVGFWV I GM 59 

// 

orf 101 .pep I A I GLFL I YQNGLTLLFEAVEDGKI H FWLG 333 

I I I I II I I I I I I I I I IM i I M I I I I I I 
.orf lOlng SLTVSVLLLCLLAVPLSYFNPRSGHTYNILIAIGLFL I YQNGLTLLFEAVEDGKI HFWLG 331 

35 orf 101 .pep LLPMHIIMFVLALILLRVRSMPSQPFWQAVGKSLTLKGGK 373 

I MIMMMMMIMM IMMIMM 

orf lOlng LLPMHI IMFVIAIVLLRVRSMPSQPFWQAVG 362 
The ORF101ng nucleotide sequence [<SEQ ID 503>] (SEQ ID NO: 503) is predicted to encode a 
protein having partial amino acid sequence [<SEQ ID 504>] (SEQ ID NO: 504): 

1 MIYQRNLIKE LSFTAVGIFV VLLAVLVSTQ AINLLGRAAD GRVAIDAVLA 

51 LVGFWVIGMT PLLLVLTAFI STLTVLTRYW RDSEMSVWLS CGLALKQWIR 

101 PVMQFAVPFA ILIAVMQLWV IPWAELRSRE YAEILKQKQE LSLVEAGEFN 

151 NLGKRNGRVY FVETFDTESG IMKNLFLREQ DKNGGDNIIF AKEGNFSLKD 

201 NKRTLELRHG YRYSGTPGRA DYNQVSFQKL NLIISTTPKL IDPVSHRRTI 

251 STAQLIGSSN PQHQAELMWR ISLTVSVLLL CLLAVPLSYF NPRSGHTYNI 

301 LIAIGLFLIY QNGLTLLFEA VEDGKIHFWL GLLPMHIIMF VIAIVLLRVR 

351 SMPSQPFWQA VG. . . 

Further work revealed the complete nucleotide sequence [<SEQ ID 505>] (SEQ ID NO: 505): 

1 ATGATTTATC AAAGAAACCT CATCAAAGAA CTCTCTTTTA CCGCCGTCGG 

51 CATTTTCGTC GTCCTCTTGG CGGTGTTGGT GTCCACGCAG GCGATCAACC 

101 TGCTTGGCCG CGCAGCTGAC GGGCGTGTCG CCATCGATGC CGTGTTGGCC 

151 TTAGTCGGCT TCTGGGTCAT CGGTATGACC CCGCTTTTGC TGGTGTTGAC 

201 CGCATTCATC AGCACGCTGA CCGTATTGAC CCGCTACTGG CGCGACAGCG 
251 AAATGTCGGT CTGGCTATCC TGCGGATTGG CGTTGAAACA GTGGATACGC 
301 CCCGTCATGC AGTTTGCCGT GCCGTTTGCC ATCCTGATTG CCGTCATGCA 

351 GCTTTGGGTG ATACCGTGGG CAGAGCTGCG CAGCCGCGAA TATGCCGAAA 

401 TTTTGAAGCA GAAGCAGGAA TTGTCTTTGG TGGAAGCCGG CGAGTTCAAT 
451 AACTTGGGCA AGCGCAACGG CAgggtttaT TtcgtcgaaA CCTTTGACAC 
501 CGaatccgGC ATCATGAAAA ACCTGTtcct GcGCGAACAG GACAAAAACG 
551 gcggcgacaA CATCATCTTC GCcaaaGAag gtaactTctc gctgaaggaC 
601 AACAAAcgca cgctcgaATT GCGCCACGGC TACCGTTACA GCGGcacgcC 
651 CGGacGCGCc gactaCAATC AGGTTtcctt cCAAAAacTc aacctgATta 
701 TCAGCACCAC GCCCAAacTT ATCGaccCCG TTTCCCACCG CCGCACCATT 
751 tcgacCGCCC AAcTGATTGG CAGCAGCAAT CCGCAACATC AGGCAGAATT 
801 GATGTGGCGC ATCTCGCTGA CCGTCAGCGT CCTCCTGCTC TGCCTACTCG 
851 CCGTGCCGCT TTCCTATTTC AACCCGCGCA GCGGACATAC CTACAATATC 
901 TTGATTGCCA TCGGTTTGTT TTTAATTTAC CAAAACGGGC TGACCCTGCT 
951 TTTTGAAGCC GTGGAAGACG GCAAAATCCA TTTTTGGCTC GGACTGCTGC 

1001 CTATGCACAT CATCATGTTC GTCATCGCAA TCGTACTTCT GCGCGTCCGC 

1051 AGTATGCCCA GCCAGCCCTT CTGGCAGGCG GTTGGCAAAA GTCTGACATT 

1101 GAAAGgcgGA AAATGA 

This corresponds to the amino acid sequence [<SEQ ID 506; ORF101ng-1>] (SEQ ID NO: 506; 
ORF101ng-1): 



1 MIYQRNLIKE LSFTAVGIFV VLLAVLVSTQ AINLLGRAAD GRVAIDAVLA 

51 LVGFWVIGMT PLLLVLTAFI STLTVLTRYW RDSEMSVWLS CGLALKQWIR 

101 PVMQFAVPFA ILIAVMQLWV IPWAELRSRE YAEILKQKQE LSLVEAGEFN 

151 NLGKRNGRVY FVETFDTESG IMKNLFLREQ DKNGGDNIIF AKEGNFSLKD 

201 NKRTLELRHG YRYSGTPGRA DYNQVSFQKL NLIISTTPKL IDPVSHRRTI 

251 STAQLIGSSN PQHQAELMWR ISLTVSVLLL CLLAVPLSYF NPRSGHTYNI 

301 LIAIGLFLIY QNGLTLLFEA VEDGKIHFWL GLLPMHIIMF VIAIVLLRVR 

351 SMPSQPFWQA VGKSLTLKGG K* 

ORF101ng-1 (SEQ ID NO: 506) and ORF101-1 (SEQ ID NO: 500) show 97.6% identity in 371 aa 
overlap: 

10 20 30 40 50 60 

or f 101-1. pep MI YQRNLI KELSFTAVGI F WLLAVLVS TQA I NLLGRAADGRVA I DAVLALVGFWV I GMT 

1 1 1 1 1 II 1 1 1 1 1 1 M 1 1 1 1 1 1 1 1 1 1 1 1 II I ! 1 1 1 1 1 1 1 1 M 1 1 1 1 1 1 1 1 1! I M 1 1 

orf 101ng-l M I YQRNL I KELS FT AVG I F WLLAVLVS TQA I NLLGRAADGRVA I DAVLALVGFWV I GMT 

10 20 30 40 50 60 






70 80 90 100 110 120 

orf 101-1. pep PLLLVLTAFI STLTVLTRYWRDSEMSVWLSCGLALKQWIRPVMQFAVPFAVLVAVMQLWV 

I I I I I I I I I I I I I M I I I I I I I I I I I I I I I I I I I I I I I II I I I I I I I I h h I . I i I 
orf 101ng-l PLLLVLTAFI STLTVLTRYWRDSEMSVWLSCGLALKQWIRPVMQFAVPFAILIAVMQLWV 

70 80 90 100 110 120 

130 140 150 160 170 180 

orf 101-1 .pep IPWAELRSREYAEILKQKQELSLVEAGEFNSLGKRNGRVYFVETFDTESGIMKNLFLREQ 

Ml Illlllllll I Mill Mill II II I hi I I II I II II I II Ml I I Illlllllll I 
orf 101ng-l IPWAELRSREYAEILKQKQELSLVEAGEFNNLGKRNGRVYFVETFDTESGIMKNLFLREQ 

130 140 150 160 170 180 

190 200 210 220 230 240 

orf 101-1 .pep DKNGGDNIIFAKEGNFSLNDNKRTLELRHGYRYSGTPGRADYNQVS FQKLNLIISTTPKL 

! 1 1 1 1 1 1 i 1 1 1 1 1 1 1 1 1 1 = 1 1 ! I M M 1 1 1 1 1 1 1 1 1 1 1 1 1 1 E 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 

orf 101ng-l DKNGGDNI I FAKEGNFSLKDNKRTLELRHGYRYSGTPGRADYNQVS FQKLNLIISTTPKL 

190 200 210 220 230 240 

250 260 270 280 290 300 

orf 101-1. pep IDPVSHRRTIPTAQLIGSSNPQHQAELMWRISLTVSVLLLCLLAVPLSYFNPRSGHTYNI 

Illlllllll II IIIIIIIMMMIMI IIIIIIIIIIMIMIIIIIII MM' 

orf 101ng-l IDPVSHRRTISTAQLIGSSNPQHQAELMWRISLTVSVLLLCLLAVPLSYFNPRSGHTYNI 

250 260 270 280 290 300 

310 320 330 340 350 360 

orf 101-1. pep LI AIGLFLI YQNGLTLLFEAVEDGKIHFWLGLLPMHI I M FAVAL I LLRVRSM P SQ P FWQA 

I I I I I I I I I I I I I I I I I I I i I I I I I I I I I I I I I I I I I I- I- I I I I I M I I I I I 
orf 101ng-l LIAIGLFLIYQNGLTLLFEAVEDGKIHFWLGLLPMHIIMFVIAIVLLRVRSMPSQPFWQA 

310 320 330 340 350 360 



370 
orf101-1.pep VGKSLTLKGGKX 
             |||||||||||| 
orf101ng-1   VGKSLTLKGGKX 
370 



Based on this analysis, including the presence of a putative leader sequence (double-underlined) 
and several putative transmembrane domains (single-underlined) in the gonococcal protein, it is 
predicted that the proteins from N. meningitidis and N. gonorrhoeae, and their epitopes, could be 
useful antigens for vaccines or diagnostics, or for raising antibodies. 
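
The specification does not state which method was used to predict the transmembrane domains. As a hedged illustration only, the Python sketch below applies a Kyte-Doolittle hydropathy scan, a common approach that is assumed here rather than taken from the source. The function names, the 19-residue window and the 1.6 threshold are likewise conventional choices, not values from this document.

```python
# Kyte-Doolittle hydropathy index for the 20 standard amino acids.
KD = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}

def hydropathy_profile(seq, window=19):
    """Mean hydropathy of every full window along the sequence."""
    seq = "".join(seq.split()).upper()
    scores = [KD.get(aa, 0.0) for aa in seq]  # unresolved residues (e.g. X) score 0
    return [sum(scores[i:i + window]) / window
            for i in range(len(seq) - window + 1)]

def hydrophobic_windows(seq, window=19, threshold=1.6):
    """1-based start positions (and mean scores) of windows at or above the threshold."""
    return [(i + 1, round(avg, 2))
            for i, avg in enumerate(hydropathy_profile(seq, window))
            if avg >= threshold]

# First 50 residues of ORF101-1 (SEQ ID NO: 500).
n_terminus = "MIYQRNLIKELSFTAVGIFVVLLAVLVSTQAINLLGRAADGRVAIDAVLA"
for start, avg in hydrophobic_windows(n_terminus):
    print(start, avg)
```

Windows covering the run LSFTAVGIFVVLLAVLVST score well above the threshold, which is consistent with a hydrophobic N-terminal segment. A scan of this kind is only a rough screen and does not by itself distinguish a leader sequence from a membrane-spanning helix.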



Example 60 



The following partial DNA sequence was identified in N. meningitidis [<SEQ ID 507>] (SEQ ID 
NO: 507): 

1 . . GGTGGTGGTT TTATCAATGC TTCCTGTGCC ACTTTGACGA CAGCCAAACC 

51 GCAATATCAA GCAGGAGACC TTAGCGCTTT TAAGATAAGG CAAGGCAATG 

101 TTGTAATCGC CGGACACGGT TTGGATGCAC GTGATACCGA TTACACACGT 

151 ATTCTCAGTT ATCATTCCAA AATCGATGCA CCCGTATGGG GACAAGATGT 

201 TCGTGTCGTC GCGGGACAAA ACGATGTGGC CGCAACAGGT GATGCACATT 

251 CGCCTATTCT CAATAATGCT GCTGCCAATA CGTCAAACAA TACAGCCAAC 

301 AACGGCACAC ATATCCCTTT ATTTGCGATT GATACAGGCA AATTAGGAGG 

351 TAT.GTATGC CAACAAAATC ACCTTGATCA GTACGGTCGA GCAAGCAGGC 

401 ATTCGTAA 

This corresponds to the amino acid sequence [<SEQ ID 508; ORF113>] (SEQ ID NO: 508; 
ORF113): 

1 ..GGGFINASCA TLTTAKPQYQ AGDLSAFKIR QGNVVIAGHG LDARDTDYTR 

51 ILSYHSKIDA PVWGQDVRVV AGQNDVAATG DAHSPILNNA AANTSNNTAN 

101 NGTHIPLFAI DTGKLGGXVC QQNHLDQYGR ASRHS* 

Computer analysis of this amino acid sequence gave the following results: 

Homology with the pspA putative secreted protein (SEQ ID NO: 1143) of N. meningitidis 
(accession number AF030941) 

ORF113 (SEQ ID NO: 508) and pspA (SEQ ID NO: 1143) show 44% aa identity in 179aa overlap: 

orf 113 GGGF I NASCATLTTAKPQYQAGDLSAFK I RQGNW I AGHGLDARDTDYTR ILSYHSKIDA 60 

GGG INA+ TLT+ P G+L+ F + G WI G GLD D DYTRILS + + I+A 
pspa GGGLINAASVTLTSGVPVLNNGNLTGFDVSSGKWIGGKGLDTSDADYTRILSRAAEINA 256 

or f 1 1 3 PVWGQDVRWAGQNDVAATGDAHS PI LXXXXXXXXXXXXXXGTHI PLFAIDTGKLGGMYA 120 
25 VWG+DV+W+G+N + G + P AIDT LGGMYA 

pspa GVWGKDVKWSGKNKLDFDG SLAKTASAPSSSDSVTPTVAIDTATLGGMYA 307 

or f 1 1 3 NKITLIS TVEQAG I RNQGQWFAS AGNVAVNAEGKLVNTGM I AATGENHAVS LHARNVHN 179 

+KITLIST A IRN+G+ FA+ G V ++A+GKL N+G I A ++ + A+ V N 

pspa DKI TLI STDNGAVIRNKGRI FAATGGVTLS ADGKLSNSGS IDAA EITISAQTVDN 362 

Homology with a predicted ORF from N. gonorrhoeae 

ORF113 (SEQ ID NO: 508) shows 86.5% identity in 52aa overlap at the N-terminal part and 
94.1% identity in 17aa overlap at the C-terminal part with a predicted ORF (ORF113ng) (SEQ ID 
NO: 510) from N. gonorrhoeae: 



orf 113 GGGFINASCATLTTAKPQYQAGDLSAFKIR 30 

35 I I I I I I I I I I I I I I I I I I I I : I : I I I I 

orf 113ng SHPSQLNGYIEVGGRRAEWIANPAGIAVNGGGFINASRATLTTGQPQYQAGDFSGFKIR 224 

orf 113 QGNW I AGHGLDARDTDYTR I LS YHSKIDAPVWGQDVRWAGQNDVAATGDAHS P I LNNA 9 0 

I M I I I M I I M I M I I 
orf 113ng QGNAVIAGHGLDARDTDFTRILVCQQNHLDQYGRTSRHS 263 
orf 113 IDTGKLGGXVCQQNHLDQYGRASRHS 135 

orf 113ng 




The complete length ORF113ng nucleotide sequence [<SEQ ID 509>] (SEQ ID NO: 509) is 
predicted to encode a protein having amino acid sequence [<SEQ ID 510>] (SEQ ID NO: 510): 



1 MNKTLYRVIF NRKRGAVVAV AETTKREGKS CADSGSGSVY VKSVSFIPTH 

51 SKAFCFSALG FSLCLALGTV NIAFADGIIT DKAAPKTQQA TILQTGNGIP 

101 QVNIQTPTSA GVSVNQYAQF DVGNRGAILN NSRSNTQTQL GGWIQGNPWL 

151 TRGEARVVVN QINSSHPSQL NGYIEVGGRR AEVVIANPAG IAVNGGGFIN 

201 ASRATLTTGQ PQYQAGDFSG FKIRQGNAVI AGHGLDARDT DFTRILVCQQ 

251 NHLDQYGRTS RHS* 



Based on this analysis, it is predicted that these proteins from N. meningitidis and N. gonorrhoeae, 
and their epitopes, could be useful antigens for vaccines or diagnostics, or for raising antibodies. 



Example 61 

The following partial DNA sequence was identified in N. meningitidis [<SEQ ID 511>] (SEQ ID 
NO: 511): 

1 . . TCAACGGGAC ATAGCGAACA AAATTACACT TTGCCGCGAG AAATCACACG 
51 CAACATTTCA CTGGGTTCAT TTGCCTATGA ATCGCATCGC AAAGCATTAA 
101 GCCATCATGC GCCCAGCCAA GGCACTGAGT TGCCGCAAAG CAACGGTATT 
151 TGGCTACCCT ATACGTCCAA TTCTTTTACC CCATTACCCA GCAGCAGCTT 
201 ATACATTATC AATCCTGTCA ATAAAGGCTA TCTTGTTGAA ACCGATCCAC 
251 GCTTTGCCAA CTACCGTCAA TGGTTGGGTA GTGACTATAT GCtGGACAGC 
301 CTCAAACTAG ACCCAAACAA TTTACATAAA CGTTTGGGTG ATGGTTATTA 
351 CGAGCAACGT TTAATCAATG AACAAATCGC AGAGCTGACA GGGCATCGTC 
401 GTTTAGAcGG TTATCAAAAC GACGAAGAAC AATTTAAAGC CTTAATGGAT 
451 AATGGCGCGA CTGCGGCACG TTcGATGAAT CTCAGCGTTG GCATTGCATT 
501 AAGTGCCGAG CAAGTAGCGC AACTGACCAG CGATATTGTT TGGTTGGTAC 
551 AAAAAGAAGT TAAGCTTCCT GATGGCGGCA CACAAACCGT ATTGGTGCCA 
601 CAGGTTTATG TACGCGTTAA AAATGGCGAC ATAGACGGTA AAGGTGCATT 
651 GTTGTCAGGC AGCAATACAC AAATCAATGT TTCAGGCAGC CTGAAAAACT 
701 CAGGCACGAT TGCAGGgCGC AATGCGCTTA TTATCAATAC CGATACGCTA 
751 GACAATATCG GTGGGCGTAT TCATGCGCAA AAATCAGCGG TTACGGCCAC 
801 ACAAGACATC AATAATATTG GCGGCATGCT TTCTGCCGAA CAGACATTAT 
851 TGCTCAACGC AGGCAACAAC ATCAACAGCC AAAGCACCAC CGCCAGCAGT 
901 CAAAATACAC AAGGCAGCAG CACCTACCTA GACCGAATGG CAGGTATTTA 
951 TATCACAGGC AAAGAAAAAG GTGTTT . . 

This corresponds to the amino acid sequence [<SEQ ID 512; ORF115>] (SEQ ID NO: 512; 
ORF115): 

1 . . STGHSEQNYT LPREITRNIS LGSFAYESHR KALSHHAPSQ GTELPQSNGI 
51 SLPYTSNSFT PLPSSSLYII NPVNKGYLVE TDPRFANYRQ WLGSDYMLDS 
101 LKLDPNNLHK RLGDGYYEQR LINEQIAELT GHRRLDGYQN DEEQFKALMD 
151 NGATAARSMN LSVGIALSAE QVAQLTSDIV WLVQKEVKLP DGGTQTVLVP 
201 QVYVRVKNGD IDGKGALLSG SNTQINVSGS LKNSGTIAGR NALIINTDTL 
251 DNIGGRIHAQ KSAVTATQDI NNIGGMLSAE QTLLLNAGNN INSQSTTASS 
301 QNTQGSSTYL DRMAGIYITG KEKGV. . 

Computer analysis of this amino acid sequence gave the following results: 

Homology with the pspA putative secreted protein (SEQ ID NO: 1143) of N. meningitidis 
(accession number AF030941) 

ORF115 (SEQ ID NO: 512) and pspA protein (SEQ ID NO: 1143) show 50% aa identity in 325aa 
overlap: 

Orf115: 1 STGHSEQNYTLPREITRNISLGSFAYESHRKALSHHAPSQGTELPQSNGISLPYTSNSFT 60 

STG+S Y " E+ + +1 +G AY+ + + P + NGI +T 
STGYSRSPYEPAPEVS -SI RMGI S AYKGYAPQQASDI PGTWPWAENGIHPTFT 831 

PLPSSSLYIINPVNKGYLVETDPRFANYRQWLGSDYMLDSLKLDPNNLHKRLGDGYYEQR 120 
LP+SSL+ I P NKGYL+ETDP F +YR+WLGS YML +L+ DPN++HKRLGDGYYEQ+ 
pspA: 832 -LPNSSLFAIAPNNKGYLIETDPAFTDYRKWLGSGYMLAALQQDPNHIHKRLGDGYYEQK 890 

LINEQI AELTGHRRLDGYQNDEEQFKALMDNGATAARSMNLSVGI ALSAEQVAQLTSDI V 180 
L+NEQIA+LTG+RRLDGY NDEEQFKALMDNG T A+ + L+ GIALSAEQVA+LTSDIV 






Orf 115: 


1 


pspA: 


778 


Orf 115: 


61 


pspA: 


832 


Orf 115: 


121 


pspA: 


891 


Orf 115: 


181 


pspA: 


951 


Orf 115: 


240 


pspA: 


1010 


Orf 115: 


300 


pspA: 


1069 



20 WL + V LPDG TQTVL P+VYVR + D++G+GALLSGS I SG+++N G I AG 



R ALI+N + N+ G + + A DI N G + AE LLL A 



+ R+AGIY+TG++ G 



Homology with a predicted ORF from N. gonorrhoeae 

ORF115 (SEQ ID NO: 512) shows 91.9% identity over a 334aa overlap with a predicted ORF 
(ORF115ng) (SEQ ID NO: 514) from N. gonorrhoeae: 

orf 115 .pep STGHSEQNYTLPREITRNISLGSFAYESHRK 31 

III IIIIMIIhlilllllll I I 
orf 115ng NEQTFGEKKVFSENGKLHNYWRARRKGHDETGHREQNYTLPEEITRDISLGSFAYESHSK 71 

orf 115 .pep ALSHHAPSQGTELPQSN GISLPYTSNSFTPLPSSSLYI INPVNKGYLVET 81 

35 1 1 1:||| || Ml Mill ' II M II I 1 1 1 1 1 1 1 : 1 1 1 1 1 M M 1 1 U 

orf 115ng ALSRHAPSQGTELPQSNRDNIRTAKSNGISLPYTPNSFTPLPGSSLYIINPANKGYLVET 131 
orf 115 .pep DPRFANYRQWLGSDYMLDSLKLDPNNLHKRLGDGYYEQRLINEQIAELTGHRRLDGYQND 141 

Illllllllllllllll III I I I i I I I I I I 

or f 1 1 5ng DPRFANYRQWLGSDYMLGSLKLDPNNLHKRLGDGYYEQRLINEQIAELTGHRRLDGYQND 191 

orf 115. pep EEQFKALMDNGATAARSMNLSVGI ALSAEQVAQLTSDIVWLVQKEVKLPDGGTQTVLVPQ 201 

5 Ml 1 1 1| || || MM II I MM II I II llh II I Mill II I II III MM IN II hi I 

orf 1 1 5ng EEQFKALMDNGATAARSMNLSVGI ALSAEQAAQLTSDI VWLVQKEVKLPDGGTQTVLMPQ 251 

orf 115. pep VYVRVKNGD IDGKGALLSGSNTQINVSGSLKNSGTIAGRNALI INTDTLDNIGGRIHAQK 261 

1 1 1 II I II 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 II 1 1 1 1 1 1 1 1 1 II 1 1 1 1 1 1 1 1 1 1 M 1 1 

orf 115ng VYVRVKNGG I DGKGALLSGSNTQ I NVSGSLKNSGT I AGRNAL I INTDTLDNIGGRIHAQK 311 

10 orf 115 .pep SAVTATQDINNIGGMLSAEQTLLLNAGNNINSQSTTASSQNTQGSSTYLDRMAGIYITGK 321 

IIIIIIIIIIIIIMIIIII I IMIIIMIh I I I I : I I I I I I ! II I I I I I I I I I 
orf 115ng SAVTATQDINNIGGILSAEQTLLLNAGNNINNQSTAKSSQNAQGSSTYLDRMAGIYITGK 3 71 

orf 115. pep EKGV 325 
MM 

15 orf 115ng EKGVLAAQAGKDINI IAGQI SNQSDQGQTRLQAGRDINLDTVQTGKYQEIHFDADNHTIR 431 

An ORF115ng nucleotide sequence [<SEQ ID 513>] (SEQ ID NO: 513) was predicted to encode a 
protein having amino acid sequence [<SEQ ID 514>] (SEQ ID NO: 514): 

1 MLVQTEKDGL HNEQTFGEKK VFSENGKLHN YWRARRKGHD ETGHREQNYT 

51 LPEEITRDIS LGSFAYESHS KALSRHAPSQ GTELPQSNRD NIRTAKSNGI 

101 SLPYTPNSFT PLPGSSLYII NPANKGYLVE TDPRFANYRQ WLGSDYMLGS 

151 LKLDPNNLHK RLGDGYYEQR LINEQIAELT GHRRLDGYQN DEEQFKALMD 

201 NGATAARSMN LSVGIALSAE QAAQLTSDIV WLVQKEVKLP DGGTQTVLMP 

251 QVYVRVKNGG IDGKGALLSG SNTQINVSGS LKNSGTIAGR NALIINTDTL 

301 DNIGGRIHAQ KSAVTATQDI NNIGGILSAE QTLLLNAGNN INNQSTAKSS 

351 QNAQGSSTYL DRMAGIYITG KEKGVLAAQA GKDINIIAGQ ISNQSDQGQT 

401 RLQAGRDINL DTVQTGKYQE IHFDADNHTI RGSTNEVGSS IQTKGDVTLL 

451 SGNNLNAKAA EVGSAKGTLA VYAKNDITIS SGIHAGQVDD ASKHTGRSGG 

501 GNKLVITDKA QSHHETAQSS TFEGKQVVLQ AGNDANILGS NVISDNGTRI 

551 QAGNHVRIGT TQTQSQSETY HQTQKSGLMS AGIGFTIGSK TNTQENQSQS 

601 NEHTGSTVGS LKGDTTIVAS KHYEQTGSNV SSPEGNNLIS TQSMDIGAAQ 

651 NQLNSKTTQT YEQKGLTVAF SSPVTDLAQQ AIAVAHKAAK QFDKAKTTAL 

701 MPWRLPMQVG RLFKQAKAPK K* 

Further work revealed the following partial gonococcal DNA sequence [<SEQ ID 515>] (SEQ ID 
NO: 515): 

1 TTGCTTGTGC AAACAGAAAA AGACGGTTTG CATAACGAGC AAACCTTTGG 

51 CGAGAAGAAA GTCTTCAGCG AAAATGGTAA GTTGCACAAC TACTGGCGTG 

101 CGCGTCGTAA AGGACATGAT GAAACAGGGC ATCGTGAACA AAATTATACT 

151 TTGCCGGAGG AAATCACACG CGACATTTCA CTGGGTTCAT TTGCCTATGA 

201 ATCGCATAGC AAAGCATTAA GCCGTCATGC GCCCAGCCAA GGCACTGAGT 

251 TGCCACAAAG TAACCGGGAT AATATCCGTA CTGCGAAAAG CAACGGTATT 

301 TCGCTACCCT ATACGCCCAA TTCTTTTACC CCATTACCCG GCAGCAGCTT 

351 ATACATTATC AATCCTGCCA ATAAAGGCTA TCTTGTTGAA ACCGATCCAC 
401 GCTTTGCCAA CTACCGTCAA TGGTTGGGTA GTGACTATAT GCTGGGCAGC 

451 CTCAAACTAG ACCCAAACAA TTTACATAAA CGTTTGGGTG ATGGTTATTA 
501 CGAGCAACGT TTAATCAATG AACAAATCGC AGAGCTGACA GGGCATCGTC 
551 GTTTAGACGG TTATCAAAAC GACGAAGAAC AATTTAAAGC CTTAATGGAT 
601 AATGGCGCGA CTGCGGCACG TTCGATGAAT CTCAGCGTTG GCATTGCATT 

651 AAGTGCCGAG CAAGCAGCGC AACTGACCAG CGATATTGTT TGGTTGGTAC 
701 AAAAAGAAGT TAAACTTCCT GATGGCGGCA CACAAACCGT ATTGATGCCA 

751 CAGGTTTATG TACGCGTTAA AAATGGCGGC ATAGACGGTA AAGGTGCATT 

801 GTTGTCAGGC AGCAATACAC AAATCAATGT TTCAGGCAGC CTGAAAAACT 

851 CAGGCACGAT TGCAGGGCGC AATGCGCTTA TTATCAATAC CGATACGCTA 

901 GACAATATCG GTGGGCGTAT TCATGCGCAA AAATCAGCGG TTACGGCCAC 

951 ACAAGACATC AATAATATTG GCGGCATTCT TTCTGCCGAA CAGACATTAT 

1001 TGCTCAATGC GGGTAACAAC ATCAACAACC AAAGCACGGC CAAGAGCAGT 

1051 CAAAATGCAC AAGGTAGCAG CACCTACCTA GACCGAATGG CAGGTATTTA 

1101 TATCACAGGC AAAGAAAAAG GTGTTTTAGC AGCGCAGGCA GGCAAAGACA 

1151 TCAACATCAT TGCCGGTCAA ATCAGCAATC AATCAGATCA AGGGCAAACC 

1201 CGGCTGCAGG CAGGACGCGA CATTAACCTG GATACGGTAC AAACCGGCAA 

1251 ATATCAAGAA ATCCATTTTG ATGCCGATAA CCATACCATC CGAGGTTCAA 

1301 CGAACGAAGT CGGCAGCAGC ATTCAAACAA AAGGCGATGT TACCCtatTG 

1351 TCAGGGAATA ATCTCAATGC CAAAGCTGCC GAAGTCGGCA GCGCAAAAGG 

1401 CACACTTGCC GTGTATGCTA AAAATGACAT TACTATCAGC TCAGGCATCC 
1451 ATGCCGGCCA AGTTGATGAT GCGTCCAAAC ATACAGGCAG AAGCGGCGGC 
1501 GGTAATAAAT TAGTCATTAC CGATAAAGCC CAAAGTCATC ACGAAACTGC 
1551 TCAAAGCAGC ACCTTTGAAG GCAAGCAAGT TGTATTGCAG GCAGGAAACG 
1601 ATGCCAACAT CCTTGGCAGT AATGTTATTT CCGATAATGG CACCCGGATT 
1651 CAAGCAGGCA ATCATGTTCG CATTGGTACA ACCCAAACTC AAAGCCAAAG 
1701 CGAAACCTAT CATCAAACCC AAAAATCAGG ATTGATGAGT GCAGGTATCG 
1751 GCTTCACTAT TGGCAGCAAG ACAAACACAC AAGAAAACCA ATCCCAAAGC 
1801 AACGAACATA CAGGCAGTAC CGTAGGCAGC CTGAAAGGCG ATACCACCAT 
1851 TGTTGCAAGC AAACACTACG AACAAACCGG CAGCAACGTT TCCAGCCCTG 
1901 AGGGCAACAA CCTTATCAGC ACGCAAAGTA TGGATATTGG CGCAGCACAA 
1951 AACCAATTAA ACAGCAAAAC CACCCAAACC TACGAACAAA AAGGCTTAAC 
2001 GGTGGCATTC AGTTCGCCCG TTACCGATTT GGCACAACAA GCGATTGCCG 
2051 TAGCACACAA AGCAGCAAAC AAGTCGGACA AAGCAAAAAC GACCGCGTTA 
2101 ATGCCATGGC GGCTGCCAAT GCAGGTTGGC AGGCCTATCA AACAGGCAAA 
2151 GGCGCACAAA ACTTAG 

This corresponds to the amino acid sequence [<SEQ ID 516; ORF115ng-1>] (SEQ ID NO: 516; 
ORF115ng-1): 



1 LLVQTEKDGL HNEQTFGEKK VFSENGKXHN YWRARRKGHD ETGHREQNYT 

51 LPEEITRDIS LGSFAYESHS KALSRHAPSQ GTELPQSNRD NIRTAKSNGI 

101 SLPYTPNSFT PLPGSSLYII NPANKGYLVE TDPRFANYRQ WLGSDYMLGS 

151 LKLDPNNLHK RLGDGYYEQR LINEQIAELT GHRRLDGYQN DEEQFKALMD 

201 NGATAARSMN LSVGIALSAE QAAQLTSDIV WLVQKEVKLP DGGTQTVLMP 

251 QVYVRVKNGG IDGKGALLSG SNTQINVSGS LKNSGTIAGR NALIINTDTL 

301 DNIGGRIHAQ KSAVTATQDI NNIGGILSAE QTLLLNAGNN INNQSTAKSS 

351 QNAQGSSTYL DRMAGIYITG KEKGVLAAQA GKDINIIAGQ ISNQSDQGQT 

401 RLQAGRDINL DTVQTGKYQE IHFDADNHTI RGSTNEVGSS IQTKGDVTLL 

451 SGNNLNAKAA EVGSAKGTLA VYAKNDITIS SGIHAGQVDD ASKHTGRSGG 

501 GNKLVITDKA QSHHETAQSS TFEGKQVVLQ AGNDANILGS NVISDNGTRI 

551 QAGNHVRIGT TQTQSQSETY HQTQKSGLMS AGIGFTIGSK TNTQENQSQS 

601 NEHTGSTVGS LKGDTTIVAS KHYEQTGSNV SSPEGNNLIS TQSMDIGAAQ 

651 NQLNSKTTQT YEQKGLTVAF SSPVTDLAQQ AIAVAHKAAN KSDKAKTTAL 

701 MPWRLPMQVG RPIKQAKAHK T* 

This gonococcal protein (ORF115ng-1) (SEQ ID NO: 516) shows 91.9% identity with ORF115 
(SEQ ID NO: 512) over 334aa: 



20 30 40 50 60 70 

orf 115ng- 1 . p NEQTFGEKKVFSENGKLHNYWRARRKGHDETGHREQNYTLPEE I TRDI SLGS FAYESHSK 

III M 1 1 M M 1 1 M i 1 1 ! I i M I I 

orf 115 STGHSEQNYTLPREITRNISLGSFAYESHRK 
10 20 30 

80 90 • 100 110 120 130 

orf 115ng-l.p ALSRHAPSQGTELPQSNRDNIRTAKSNGISLPYTPNS FTPLPGSSLYI INPANKGYLVET 

II hi I I I I M I I I I I I IIIMII I I I ! MM I I I I I I I : I I I I I I I 

5 orf 115 ALSHHAPSQGTELPQSN GISLPYTSNSFTPLPSSSLYIINPVNKGYLVET 

40 50 60 70 80 

140 150 160 170 180 190 

orf 115ng- 1 . p DPRFANYRQWLGSDYMLGSLKLDPNNLHKRLGDGYYEQRLINEQIAELTGHRRLDGYQND 

1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 IIIMII MIMMIIIII MINI IIIIIIIIIIMI 

1 0 orf 115 DPRFANYRQWLGSDYMLDSLKLDPNNLHKRLGDGYYEQRLINEQI AELTGHRRLDGYQND 

90 100 110 120 130 140 

200 210 220 230 240 250 

orf 115ng-l.p EEQFKALMDNGATAARSMNLSVGIALSAEQAAQLTSDIVWLVQKEVKLPDGGTQTVLMPQ 
1 1 1 1 1 1 1 1 1 1 1 1 1 M 1 1 ■ I I I I I I I I I I h I I I I I I I ' I I M I I I I I M I I I I h 
1 5 orf 115 EEQFKALMDNGATAARSMNLSVGIALSAEQVAQLTSDIVWLVQKEVKLPDGGTQTVLVPQ 

150 160 170 180 190 200 

260 270 280 290 300 310 

orf 115ng-l .p VYVRVKNGGIDGKGALLSGSNTQINVSGSLKNSGTIAGRNALIINTDTLDNIGGRIHAQK 

II II III I Mill 1 1 llllll II IIIMII MM II IIIIIIIIIIMI II 1 1 II MM 

20 orf 115 VYVRVKNGDIDGKGALLSGSNTQINVSGSLKNSGTIAGRNALIINTDTLDNIGGRIHAQK 

210 220 230 . 240 250 260 

320 330 340 350 360 370 

orf 115ng-l.p SAVTATQDINNIGGILSAEQTLLLNAGNNINNQSTAKSSQNAQGSSTYLDRMAGIYITGK 

MMMIMMMhllMIMMMIMIhMh MlhllMIIMIIIIIIIIII 

25 orf 115 SAVTATQDINNIGGMLSAEQTLLLNAGNNINSQSTTASSQNTQGSSTYLDRMAGIYITGK 

270 280 290 300 310 320 

380 390 400 410 420 430 

orf 115ng- 1 . p E KG VLAAQ AG KD I N I I AGQ I SNQSDQGQTRLQAGRD I NLDTVQTGKYQE IHFDADNHT I R 
MM 

30 orf 115 EKGV 

In addition, it shows homology with a secreted N. meningitidis protein (SEQ ID NO: 1143) in the 
database: 

gi|2623258 (AF030941) putative secreted protein [Neisseria meningitidis] Length = 2273 

Score = 604 bits (1541), Expect = e-172 

Identities = 325/678 (47%), Positives = 449/678 (65%), Gaps = 22/678 (3%) 






Query: 


1 


Sbjct : 


739 


Query: 


61 


Sbjct: 


797 


Query: 


121 


Sbjct: 


841 



LLVQTEKDGLHNEQTFGEKKVFSENGKLHNYWRARRKGHDETGHREQNYTLPEEITRDIS 60 
L+V T + L N++T G K + ++ G LH Y R +KG D TG+ Y E++ I 



LGSFAYESHSKALSRHAPSQGTELPQSNRDNIRTAKSNGISLPYTPNS FTPLPGSSLYI I 120 
+G AY+ + AP Q +++P + + NGI +T LP SSL+ I 

MGISAYKGY APQQASDIPGTV- - -VPWAENGIHPTFT LPNSSLFAI 840 



45 P NKGYL+ETDP F +YR+WLGS YML +L+ DPN++HKRLGDGYYEQ+L+NEQIA+LT 



Query : 
Sbjct : 
Query : 
Sbjct: 
Query : 
Sbjct: 
Query : 
Sbjct: 
Query : 
Sbjct: 
Query: 
Sbjct : 
Query : 
Sbjct: 
Query : 
Sbjct: 
Query: 
Sbjct: 



181 
901 
241 
961 
300 



GHRRLDGYQNDEEQFKALMDNGATAARSMNLSVGIALSAEQAAQLTSDI VWLVQKEVKLP 24 0 
G+RRLDGY NDEEQFKALMDNG T A+ + L+ GIALSAEQ A+LTSDIVWL + V LP 
GYRRLDGYTNDEEQFKALMDNGITIAKELQLTPGIALSAEQVARLTSDIVWLENETVTLP 960 

DGGTQTVLMPQVYVRVKNGG I DGKGALLSGSNTQ INVSGSLKN- SGTI AGRNALI INTDT 299 
DG TQTVL P+VYVR + ++G+GALLSGS I SG+ + +N G IAGR ALI+N 
DGTTQTVLKPKVYVRARPKDMNGQGALLSGS VVD IG - SGAI ENRGGL I AGREAL I LNAQN 1019 



LDN I GGR I HAQ KS AVT ATQD I NN I GG I LS AEQTLLLN AGNN I NNQS T AKS SQNAQGS S T Y 359 
+ N+ G + + A DI N G I AE LLL A NNI ++S +S+QN QGS 

1020 I KNLQGDLQGKN I FAAAGSD I TNTGS I - GAENALLLKASNN I ESRS ETRSNQNEQGS VRN 1078 

360 LDRMAGI YITGKEKGVLAAQAGKDINI I AGQISNQSDQGQTRLQAGRDINLDTVQTGKYQ 419 

+ R+AGIY+TG++ G + AG +1 + A +++NQS+ GQT L AG DI DT + Q 
1079 I GRVAG I YLTGRQNGS VLLDAGNN I VLTASELTNQS EDGQTVLNAGGD I RSDTTG I SRNQ 113 8 

420 E I HFDADNHT I RGS TNE VGSS IQTKGDVTLLSGNNLNAKAAEVGS AKGTLAVYAKND I T I 47 9 

FD+DN+ IR NEVGS+I+T+G+++L + + + +AAEVGS +G L + A DI + 
1139 NT I FDSDNYV I RKEQNEVGST I RTRGNLS LNAKGD I R I RAAE VGSEQGRLKLAAGRD I KV 1198 

480 S S G I HAGQ VDD AS KHTGRS GGGNKL V I TDKAQ S HHET AQ S S T FEGKQ WLQ AGNDAN I LG 53 9 

+G + +DA K+TGRSGGG K +T +++ AST +GK+++L +G D + G 
1199 EAGKAHTETEDALKYTGRSGGGIKQKMTRHLKNQNGQAVSGTLDGKEI ILVSGRDITVTG 1258 

54 0 SNVISDNGTRIQAGNHVRIGTTQTQSQSETYHQTQKSGLM-SAGIGFTIGSKTNTQENQS 598 

SN+I+DN T + A N++ + +T+S+S ++ +KSGLM S GIGFT GSK +TQ N+S 
1259 SNI I ADNHT I LSAKNNI VLKAAETRSRSAEMNKKEKSGLMGSGGIGFTAGS KKDTQTNRS 1318 

599 QSNEHTGS TVGS LKGDTT I VAS KH YEQTGSNVS S PEGNNL I S TQSMD I GAAQNQLNS KTT 658 

++ HT S VGSL G+T I A KHY QTGS +SSP+G+ IS+ + I AAQN+ + + + 
1319 ETVS HTESWGS LNGNTL I S AGKH YTQTGS T I S S PQGDVG I S S GKI S I DAAQNRYSQES K 13 78 

65 9 QTYEQKGLTVAFS S P VTD 676 

Q YEQKG+TVA S PV + 
13 7 9 Q VYEQKGVTVA I SVPWN 13 96 






Based on this analysis, it is predicted that the proteins from N. meningitidis and N. gonorrhoeae, and 
their epitopes, could be useful antigens for vaccines or diagnostics, or for raising antibodies. 



Example 62 

The following partial DNA sequence was identified in N. meningitidis [<SEQ ID 517>] (SEQ ID 
NO: 517): 

1 . TCAGGGAATA ACCTCAATGC CAAAGCTGCC GAAGTCAGCA GCGCAAACGG 
51 TACACTCGCT GTGTCTGCCA ATAATGACAT CAACATCAGC GCAGGCATCA 
101 ACACGACCCA TGTTGATGAT GCGTCCAAAC ACACAGGCAG AAGCGGTGGT 
151 GGCAATAAAT TAGTCATTAC CGATAAAGCC CAAAGTCATC ACGAAACCGC 
201 CCAAAGCAGC ACCTTTGAAG GCAAGCAAGT TGTATTGCAG GCAGGAAACG 
251 ATGCCAACAT CCTTGGCAGC AATGTTATTT CCGATAATGG CACCCAGATT 
301 CAAGCAGGCA ATCATGTTCG CATTGGTACA ACCCAAACTC AAAGCCAAAG 
351 CGAAACCTAT CATCAAACCC AGAAATCAGG ATTGATGAGT GCAGGTATCG 
401 GCTTCACTAT TGGCAGCAAG ACAAACACAC AAGAAAACCA ATCCCAAAGC 
451 AACGAACATA CAGGCAGTAC CGTAGGCAGC TTGAAAGGCG ATACCACCAT 
501 TGTTGCAGGC AAACACTACG AACAAATCGG CAGTACCGTT TCCAGCCCGG 
551 AAGGCAACAA TACCATCTAT GCCCAAAGCA TAGACATTCA AGCGGCACAC 
601 AACAAATTAA ACAGTAATAC CACCCAAACC TATGAACAAA AAGG.CTAAC 
651 GGTGGCATTC AGTTCGCCCG TTACCGATTT GGCACAACAA . . . 



This corresponds to the amino acid sequence [<SEQ ID 518; ORF117>] (SEQ ID NO: 518; ORF117):



1 . . SGNNLNAKAA EVSSANGTLA VSANNDINIS AGINTTHVDD ASKHTGRSGG 

51 GNKLVITDKA QSHHETAQSS TFEGKQVVLQ AGNDANILGS NVISDNGTQI

101 QAGNHVRIGT TQTQSQSETY HQTQKSGLMS AGIGFTIGSK TNTQENQSQS 

151 NEHTGSTVGS LKGDTTIVAG KHYEQIGSTV SSPEGNNTIY AQSIDIQAAH 

201 NKLNSNTTQT YEQKXLTVAF SSPVTDLAQQ . . . 
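
The correspondence between the partial DNA sequence and the ORF117 amino acid sequence can be spot-checked by translating the listed bases. The following is a minimal sketch, assuming the standard genetic code and assuming that the first listed base begins a codon. The name `translate` is illustrative only, and any codon containing an undetermined base (shown as `.` in the listing) is reported as `X`.

```python
from itertools import product

# Standard genetic code, laid out in the conventional T, C, A, G order
# (assumption: the standard table applies to these sequences).
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def translate(dna):
    """Translate a DNA string read in frame from its first base.

    Whitespace between ten-base blocks is ignored. Codons that contain
    anything other than A, C, G or T (for example '.') become 'X'.
    A trailing incomplete codon is dropped.
    """
    dna = "".join(dna.split()).upper()
    protein = []
    for i in range(0, len(dna) - 2, 3):
        protein.append(CODON_TABLE.get(dna[i:i + 3], "X"))
    return "".join(protein)


if __name__ == "__main__":
    # First three blocks of the partial DNA sequence listed above.
    print(translate("TCAGGGAATA ACCTCAATGC CAAAGCTGCC"))
    # Row beginning at base 601, which contains one undetermined base.
    print(translate("AACAAATTAA ACAGTAATAC CACCCAAACC TATGAACAAA AAGG.CTAAC"))
```

Run on the row beginning at base 601, the sketch places an `X` at the fifteenth codon of that row, which corresponds to residue 215 of the amino acid listing.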



Computer analysis of this amino acid sequence gave the following results: 

Homology with the pspA putative secreted protein (SEQ ID NO: 1143) of N. meningitidis
(accession number AF030941)

ORF117 (SEQ ID NO: 518) and pspA protein (SEQ ID NO: 1143) show 45% aa identity in 224aa overlap:



Orf117:    4 NLNAKAAEVSSANGTLAVSANNDINISAGINTTHVDDASKHTGRSGGGNKLVITDKAQSH 63
             ++  +AAEV S  G L ++A  DI + AG   T  +DA K+TGRSGGG K  +T   ++
pspA:   1173 DIRIRAAEVGSEQGRLKLAAGRDIKVEAGKAHTETEDALKYTGRSGGGIKQKMTRHLKNQ 1232

Orf117:   64 HETAQSSTFEGKQVVLQAGNDANILGSNVISDNGTQIQAGNHVRIGTTQTQSQSETYHQT 123
             +  A S T +GK+++L +G D  + GSN+I+DN T + A N++ +   +T+S+S   ++
pspA:   1233 NGQAVSGTLDGKEIILVSGRDITVTGSNIIADNHTILSAKNNIVLKAAETRSRSAEMNKK 1292

Orf117:  124 QKSGLM-SAGIGFTIGSKTNTQENQSQSNEHTGSTVGSLKGDTTIVAGKHYEQIGSTVSS 182
             +KSGLM S GIGFT GSK +TQ N+S++  HT S VGSL G+T I AGKHY Q GST+SS
pspA:   1293 EKSGLMGSGGIGFTAGSKKDTQTNRSETVSHTESVVGSLNGNTLISAGKHYTQTGSTISS 1352

Orf117:  183 PEGNNTIYAQSIDIQAAHNKLNSNTTQTYEQKXLTVAFSSPVTD 226
             P+G+  I +  I I AA N+ +  + Q YEQK +TVA S PV +
pspA:   1353 PQGDVGISSGKISIDAAQNRYSQESKQVYEQKGVTVAISVPVVN 1396

Homology with a predicted ORF from N. gonorrhoeae 



ORF117 (SEQ ID NO: 518) shows 90% identity over a 230aa overlap with a predicted ORF
(ORF117ng) (SEQ ID NO: 520) from N. gonorrhoeae:



orf117.pep                                SGNNLNAKAAEVSSANGTLAVSANNDINIS 30

orf117.pep  AGINTTHVDDASKHTGRSGGGNKLVITDKAQSHHETAQSSTFEGKQVVLQAGNDANILGS 90
            :||:: :|||||||||||||||||||||||||||||||||||||||||||||||||||||
orf117ng    SGIHAGQVDDASKHTGRSGGGNKLVITDKAQSHHETAQSSTFEGKQVVLQAGNDANILGS 540

orf117.pep  NVISDNGTQIQAGNHVRIGTTQTQSQSETYHQTQKSGLMSAGIGFTIGSKTNTQENQSQS 150
            ||||||||:|||||||||||||||||||||||||||||||||||||||||||||||||||
orf117ng    NVISDNGTRIQAGNHVRIGTTQTQSQSETYHQTQKSGLMSAGIGFTIGSKTNTQENQSQS 600

orf117.pep  NEHTGSTVGSLKGDTTIVAGKHYEQIGSTVSSPEGNNTIYAQSIDIQAAHNKLNSNTTQT 210
            ||||||||||||||||||| ||||| ||:|||||||| | :||:|| ||:|:|||:||||
orf117ng    NEHTGSTVGSLKGDTTIVASKHYEQTGSNVSSPEGNNLISTQSMDIGAAQNQLNSKTTQT 660

orf117.pep  YEQKXLTVAFSSPVTDLAQQ 230
            |||| |||||||||||||||
orf117ng    YEQKGLTVAFSSPVTDLAQQAIAVAHKAAKQFDKAKTTALMPWRLPMQVGRLFKQAKAPK 720
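
The percent identity quoted for an overlap can be recomputed from any pair of aligned rows by counting identical columns. The following is a minimal sketch under stated assumptions: the function name `percent_identity` is hypothetical, both rows are assumed to be already aligned to the same length, and `-` is assumed to mark a gap. It does not reproduce the scoring of whichever alignment program produced the figures in this document.

```python
def percent_identity(row_a, row_b, gap="-"):
    """Return (identical columns, compared columns, percent identity).

    Columns in which both rows carry a gap are ignored; a column with a
    gap in only one row counts as a compared, non-identical column.
    """
    if len(row_a) != len(row_b):
        raise ValueError("aligned rows must have the same length")
    columns = [(a, b) for a, b in zip(row_a, row_b)
               if not (a == gap and b == gap)]
    if not columns:
        raise ValueError("no columns to compare")
    identical = sum(1 for a, b in columns if a == b and a != gap)
    return identical, len(columns), 100.0 * identical / len(columns)


if __name__ == "__main__":
    # Residues 1-30 of ORF117 against residues 451-480 of the predicted
    # gonococcal protein, both copied from the listings in this document.
    orf117 = "SGNNLNAKAAEVSSANGTLAVSANNDINIS"
    orf117ng = "SGNNLNAKAAEVGSAKGTLAVYAKNDITIS"
    identical, compared, pct = percent_identity(orf117, orf117ng)
    print(f"{identical}/{compared} identical ({pct:.1f}%)")
```

Because the figure depends on which columns are included, the value for a short segment such as this one will generally differ from the figure reported over the full overlap.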

An ORF117ng nucleotide sequence [<SEQ ID 519>] (SEQ ID NO: 519) was predicted to encode a
protein having amino acid sequence [<SEQ ID 520>] (SEQ ID NO: 520):

1 . . LLVQTEKDGL HNEQTFGEKK VFSENGKLHN YWRARRKGHD ETGHREQNYT 

51 LPEEITRDIS LGSFAYESHS KALSRHAPSQ GTELPQSNRD NIRTAKSNGI 

101 SLPYTPNSFT PLPGSSLYII NPANKGYLVE TDPRFANYRQ WLGSDYMLGS

151 LKLDPNNLHK RLGDGYYEQR LINEQIAELT GHRRLDGYQN DEEQFKALMD 

201 NGATAARSMN LSVGIALSAE QAAQLTSDIV WLVQKEVKLP DGGTQTVLMP 

251 QVYVRVKNGG IDGKGALLSG SNTQINVSGS LKNSGTIAGR NALIINTDTL 

301 DNIGGRIHAQ KSAVTATQDI NNIGGILSAE QTLLLNAGNN INNQSTAKSS 

351 QNAQGSSTYL DRMAGIYITG KEKGVLAAQA GKDINIIAGQ ISNQSDQGQT 

401 RLQAGRDINL DTVQTGKYQE IHFDADNHTI RGSTNEVGSS IQTKGDVTLL

451 SGNNLNAKAA EVGSAKGTLA VYAKNDITIS SGIHAGQVDD ASKHTGRSGG 

501 GNKLVITDKA QSHHETAQSS TFEGKQVVLQ AGNDANILGS NVISDNGTRI

551 QAGNHVRIGT TQTQSQSETY HQTQKSGLMS AGIGFTIGSK TNTQENQSQS 

601 NEHTGSTVGS LKGDTTIVAS KHYEQTGSNV SSPEGNNLIS TQSMDIGAAQ 

651 NQLNSKTTQT YEQKGLTVAF SSPVTDLAQQ AIAVAHKAAK QFDKAKTTAL

701 MPWRLPMQVG RLFKQAKAPK K* 

Further work revealed the following gonococcal partial DNA sequence [<SEQ ID 521>] (SEQ ID
NO: 521):

1 TTGCTTGTGC AAACAGAAAA AGACGGTTTG CATAACGAGC AAACCTTTGG 

51 CGAGAAGAAA GTCTTCAGCG AAAATGGTAA GTTGCACAAC TACTGGCGTG 

101 CGCGTCGTAA AGGACATGAT GAAACAGGGC ATCGTGAACA AAATTATACT 

151 TTGCCGGAGG AAATCACACG CGACATTTCA CTGGGTTCAT TTGCCTATGA 

201 ATCGCATAGC AAAGCATTAA GCCGTCATGC GCCCAGCCAA GGCACTGAGT 

251 TGCCACAAAG TAACCGGGAT AATATCCGTA CTGCGAAAAG CAACGGTATT

301 TCGCTACCCT ATACGCCCAA TTCTTTTACC CCATTACCCG GCAGCAGCTT 

351 ATACATTATC AATCCTGCCA ATAAAGGCTA TCTTGTTGAA ACCGATCCAC 

401 GCTTTGCCAA CTACCGTCAA TGGTTGGGTA GTGACTATAT GCTGGGCAGC

451 CTCAAACTAG ACCCAAACAA TTTACATAAA CGTTTGGGTG ATGGTTATTA

501 CGAGCAACGT TTAATCAATG AACAAATCGC AGAGCTGACA GGGCATCGTC 

551 GTTTAGACGG TTATCAAAAC GACGAAGAAC AATTTAAAGC CTTAATGGAT 

601 AATGGCGCGA CTGCGGCACG TTCGATGAAT CTCAGCGTTG GCATTGCATT 

651 AAGTGCCGAG CAAGCAGCGC AACTGACCAG CGATATTGTT TGGTTGGTAC 

701 AAAAAGAAGT TAAACTTCCT GATGGCGGCA CACAAACCGT ATTGATGCCA

751 CAGGTTTATG TACGCGTTAA AAATGGCGGC ATAGACGGTA AAGGTGCATT 

801 GTTGTCAGGC AGCAATACAC AAATCAATGT TTCAGGCAGC CTGAAAAACT 

851 CAGGCACGAT TGCAGGGCGC AATGCGCTTA TTATCAATAC CGATACGCTA 
901 GACAATATCG GTGGGCGTAT TCATGCGCAA AAATCAGCGG TTACGGCCAC

951 ACAAGACATC AATAATATTG GCGGCATTCT TTCTGCCGAA CAGACATTAT 

1001 TGCTCAATGC GGGTAACAAC ATCAACAACC AAAGCACGGC CAAGAGCAGT 

1051 CAAAATGCAC AAGGTAGCAG CACCTACCTA GACCGAATGG CAGGTATTTA 

1101 TATCACAGGC AAAGAAAAAG GTGTTTTAGC AGCGCAGGCA GGCAAAGACA 

1151 TCAACATCAT TGCCGGTCAA ATCAGCAATC AATCAGATCA AGGGCAAACC 

1201 CGGCTGCAGG CAGGACGCGA CATTAACCTG GATACGGTAC AAACCGGCAA 

1251 ATATCAAGAA ATCCATTTTG ATGCCGATAA CCATACCATC CGAGGTTCAA 

1301 CGAACGAAGT CGGCAGCAGC ATTCAAACAA AAGGCGATGT TACCCtatTG 

1351 TCAGGGAATA ATCTCAATGC CAAAGCTGCC GAAGTCGGCA GCGCAAAAGG 

1401 CACACTTGCC GTGTATGCTA AAAATGACAT TACTATCAGC TCAGGCATCC 

1451 ATGCCGGCCA AGTTGATGAT GCGTCCAAAC ATACAGGCAG AAGCGGCGGC 

1501 GGTAATAAAT TAGTCATTAC CGATAAAGCC CAAAGTCATC ACGAAACTGC 

1551 TCAAAGCAGC ACCTTTGAAG GCAAGCAAGT TGTATTGCAG GCAGGAAACG 

1601 ATGCCAACAT CCTTGGCAGT AATGTTATTT CCGATAATGG CACCCGGATT 

1651 CAAGCAGGCA ATCATGTTCG CATTGGTACA ACCCAAACTC AAAGCCAAAG 

1701 CGAAACCTAT CATCAAACCC AAAAATCAGG ATTGATGAGT GCAGGTATCG 

1751 GCTTCACTAT TGGCAGCAAG ACAAACACAC AAGAAAACCA ATCCCAAAGC 

1801 AACGAACATA CAGGCAGTAC CGTAGGCAGC CTGAAAGGCG ATACCACCAT 

1851 TGTTGCAAGC AAACACTACG AACAAACCGG CAGCAACGTT TCCAGCCCTG 

1901 AGGGCAACAA CCTTATCAGC ACGCAAAGTA TGGATATTGG CGCAGCACAA 

1951 AACCAATTAA ACAGCAAAAC CACCCAAACC TACGAACAAA AAGGCTTAAC 

2001 GGTGGCATTC AGTTCGCCCG TTACCGATTT GGCACAACAA GCGATTGCCG

2051 TAGCACACAA AGCAGCAAAC AAGTCGGACA AAGCAAAAAC GACCGCGTTA

2101 ATGCCATGGC GGCTGCCAAT GCAGGTTGGC AGGCCTATCA AACAGGCAAA 

2151 GGCGCACAAA ACTTAG 

This corresponds to the amino acid sequence [<SEQ ID 522; ORF117ng-1>] (SEQ ID NO: 522;
ORF117ng-1):



1 LLVQTEKDGL HNEQTFGEKK VFSENGKLHN YWRARRKGHD ETGHREQNYT 

51 LPEEITRDIS LGSFAYESHS KALSRHAPSQ GTELPQSNRD NIRTAKSNGI 

101 SLPYTPNSFT PLPGSSLYII NPANKGYLVE TDPRFANYRQ WLGSDYMLGS 

151 LKLDPNNLHK RLGDGYYEQR LINEQIAELT GHRRLDGYQN DEEQFKALMD 

201 NGATAARSMN LSVGIALSAE QAAQLTSDIV WLVQKEVKLP DGGTQTVLMP 

251 QVYVRVKNGG IDGKGALLSG SNTQINVSGS LKNSGTIAGR NALIINTDTL 

301 DNIGGRIHAQ KSAVTATQDI NNIGGILSAE QTLLLNAGNN INNQSTAKSS 

351 QNAQGSSTYL DRMAGIYITG KEKGVLAAQA GKDINIIAGQ ISNQSDQGQT 

401 RLQAGRDINL DTVQTGKYQE IHFDADNHTI RGSTNEVGSS IQTKGDVTLL 

451 SGNNLNAKAA EVGSAKGTLA VYAKNDITIS SGIHAGQVDD ASKHTGRSGG 

501 GNKLVITDKA QSHHETAQSS TFEGKQVVLQ AGNDANILGS NVISDNGTRI

551 QAGNHVRIGT TQTQSQSETY HQTQKSGLMS AGIGFTIGSK TNTQENQSQS 

601 NEHTGSTVGS LKGDTTIVAS KHYEQTGSNV SSPEGNNLIS TQSMDIGAAQ 

651 NQLNSKTTQT YEQKGLTVAF SSPVTDLAQQ AIAVAHKAAN KSDKAKTTAL

701 MPWRLPMQVG RPIKQAKAHK T* 

ORF117ng-1 (SEQ ID NO: 522) shows the same 90% identity over a 230aa overlap with ORF117
(SEQ ID NO: 518). In addition, it shows homology with a secreted N. meningitidis protein (SEQ ID
NO: 1143) in the database:



gi|2623258 (AF030941) putative secreted protein [Neisseria meningitidis] Length = 2273

Score = 604 bits (1541), Expect = e-172 

Identities = 325/678 (47%), Positives = 449/678 (65%), Gaps = 22/678 (3%) 
Query: 1 LLVQTEKDGLHNEQTFGEKKVFSENGKLHNYWRARRKGHDETGHREQNYTLPEEITRDIS 60

L+V T + L N++T G K + ++ G LH Y R +KG D ( TG+ Y E++ I 
Sbjct : 739 LIVGTPESALDNDETLGTKTI - TDKGDLHRYHRHHKKGRDSTGYSRSPYEPAPEVS -SIR 796 

Query: 61 LGSFAYESHSKALSRHAPSQGTELPQSNRDNIRTAKSNGISLPYTPNSFTPLPGSSLYII 120 
5 +G AY+ + AP Q + ++P + + NGI +T LP SSL+ I 

Sbjct: 797 MGISAYKGY APQQASDI PGTV VPWAENGIHPTFT LPNSSLFAI 840 

Query : 121 NPANKGYLVETDPRFANYRQWLGSDYMLGSLKLDPNNLHKRLGDGYYEQRLINEQIAELT 180 

P NfCGYL+ETDP . F +YR+WLGS YML +L+ DPN++HKRLGDGYYEQ+L+NEQIA+LT 
Sbjct: 841 APNNKGYL I ETDP AFTD YRKWLGSGYMLAALQQDPNH I HKRLGDGYYEQKLVNEQ I AKLT 900 

10 Query: 181 GHRRLDGYQNDEEQFKALMDNGATAARSMNLSVGIALSAEQAAQLTSDIVWLVQKEVKLP 240 

G+RRLDGY NDEEQFKALMDNG T A+ + L+ GIALSAEQ A+LTSDIVWL + V LP 
Sbjct : 901 GYRRLDGYTNDEEQFKALMDNGITIAKELQLTPGIALSAEQVARLTSDIVWLENETVTLP 960 

Query : 241 DGGTQTVLMPQVYVRVKNGGIDGKGALLSGSNTQINVSGSLKN- SGTIAGRNALI INTDT 299 
DG TQTVL P+VYVR + ++G+GALLSGS I SG+++N G IAGR ALI+N 
15 Sbjct: 961 DGTTQTVLKPKVYVRARPKDMNGQGALLSGS WD I G - SGAI ENRGGL I AGREAL I LNAQN 1019 

Query : 300 LDNIGGRIHAQKSAVTATQDINNIGGILSAEQTLLLNAGNNINNQSTAKSSQNAQGSSTY 359 

+ N+ G + + A DI N G I AE LLL A NNI ++S +S+QN QGS 

Sbjct: 1020 I KNLQGDLQGKN I FAAAGSD I TNTGS I - GAENALLLKASNN I ESRSETRSNQNEQGS VRN 1078 

Query : 3 60 LDRMAG I Y I TGKE KGVLAAQAGKD INI I AGQ I SNQS DQGQTRLQAGRD I NLDTVQTGKYQ 419 
20 + R+AGIY+TG++ G + AG +1 + A +++NQS+ GQT L AG DI DT + Q 

Sbjct: 1079 IGRVAGIYLTGRQNGSVLLDAGNNIVLTASELTNQSEDGQTVLNAGGDIRSDTTGISRNQ 1138 

Query: 420 EIH FDADNHT I RGSTNEVGS S I QTKGDVTLLS GNNLNAKAAE VGS AKGTLAVYAKND I T I 47 9 

FD+DN+ IR NEVGS+I+T+G+++L + +"+ +AAEVGS +G L + A DI + 
Sbjct: 1139 NTIFDSDNYVIRKEQNEVGSTIRTRGNLSLNAKGDIRIRAAEVGSEQGRLKLAAGRDIKV 1198 

25 Query: 4 80 SSG I HAGQVDDAS KHTGRSGGGNKLVI TDKAQSHHETAQS STFEGKQ WLQAGNDAN I LG 53 9 

+G + +DA K+TGRSGGG K +T ++ + A S T +GK+ ++L +G D + G 
Sbjct: 1199 EAGKAHTETEDALKYTGRSGGG I KQKMTRHLKNQNGQAVSGTLDGKE I I LVSGRD I TVTG 1258 

Query : 540 SNV I SDNGTR I QAGNHVRI GTTQTQSQSET YHQTQKSGLM - SAG I GFTI GS KTNTQENQS 598 
SN+I+DN T + A N++ + +T+S+S ++ +KSGLM S GIGFT GSK +TQ N+S 
30 Sbjct: 1259 SN 1 1 ADNHT I LS AKNN I VLKAAETRSRSAEMNKKEKSGLMGSGG I GFTAGS KKDTQTNRS 1318 

Que ry : 5 99 QSNEHTGS TVGS LKGDTT I VAS KH YEQTGSNVS S PEGNNL I S TQSMD I GAAQNQLNS KTT 658 

++ HT S VGSL G+T I A KHY QTGS +SSP+G+ IS+ + I AAQN+ + ++ 
Sbjct: 1319 ETVSHTESWGSLNGNTLISAGKHYTQTGSTISSPQGDVGISSGKISIDAAQNRYSQESK 1378 

Query: 659 QTYEQKGLTVAFSSPVTD 676 
35 Q YEQKG+TVA S PV + 

Sbjct: 137 9 QVYEQKGVTVAISVPWN 13 96 

Based on this analysis, it is predicted that the proteins from N. meningitidis and N. gonorrhoeae, and 
their epitopes, could be useful antigens for vaccines or diagnostics, or for raising antibodies. 

The following partial DNA sequence was identified in N. meningitidis [<SEQ ID 523>] (SEQ ID
NO: 523):



1 ATGATTTACA TCGTACTGTT TCTAGCTGTC GTCCTCGCCG TTGTCGCCTA 

51 CAACATGTAT CAGGAAAACC AATACCGCAA AAAAGTGCGC GACCAGTTCG 

101 GACACTCCGA CAAAGATGCC CTGCTCAACA GCAwAACCAG CCATGTCCGC 

151 GACGGCAAAC CGTCCGGCGG GTCAGTCATG ATGCCGAAAC CCCAACCGGC 

201 GGTCAAAAAA ACGGCAAAAC CCCAAGACCC CGyCATGCGC AACCTGCAAG 

251 AACAGGATGC CGTCTACATC GCCAAGCAGA AACAGGCAAA AGCCTCCCCG 

301 TTCAAAACCG AAATCGAAAC CGCCTTGGAA GAAAGCGGCA TTATCGGCAA 

351 CTCCGCCCAC ACCGTTTCCG AACCCCAAAC CGGACATTCC GCAACGAAAC 

401 CTGCCGACGC GTCGGCAAAA CCTGCACCCG TTCCGCAAAC ACCTGCAAAA

451 CCGCTGATTA CGCTCAAAGA ACTGTCAAAA GTCGAATTAT CCTGGTTTGA

501 CGTGCGCATC GACTTCATCT CCTAT . . . 

This corresponds to the amino acid sequence [<SEQ ID 524; ORF119>] (SEQ ID NO: 524;
ORF119):



1 MIYIVLFLAV VLAVVAYNMY QENQYRKKVR DQFGHSDKDA LLNSXTSHVR

51 DGKPSGGSVM MPKPQPAVKK TAKPQDPXMR NLQEQDAVYI AKQKQAKASP

101 FKTEIETALE ESGIIGNSAH TVSEPQTGHS ATKPADASAK PAPVPQTPAK 

151 PLITLKELSK VELSWFDVRI DFISY . . . 

Further work revealed the complete nucleotide sequence [<SEQ ID 525>] (SEQ ID NO: 525):



1 ATGATTTACA TCGTACTGTT TCTAGCTGTC GTCCTCGCCG TTGTCGCCTA 

51 CAACATGTAT CAGGAAAACC AATACCGCAA AAAAGTGCGC GACCAGTTCG 

101 GACACTCCGA CAAAGATGCC CTGCTCAACA GCAAAACCAG CCATGTCCGC 

151 GACGGCAAAC CGTCCGGCGG GTCAGTCATG ATGCCGAAAC CCCAACCGGC 

201 GGTCAAAAAA ACGGCAAAAC CCCAAGACCC CGCCATGCGC AACCTGCAAG 

251 AACAGGATGC CGTCTACATC GCCAAGCAGA AACAGGCAAA AGCCTCCCCG 

301 TTCAAAACCG AAATCGAAAC CGCCTTGGAA GAAAGCGGCA TTATCGGCAA 

351 CTCCGCCCAC ACCGTTTCCG AACCCCAAAC CGGACATTCC GCACCGAAAC 

401 CTGCCGACGC GCCGGCAAAA CCTGCACCCG TTCCGCAAAC ACCTGCAAAA 

451 CCGCTGATTA CGCTCAAAGA ACTGTCAAAA GTCGAATTAC CCTGGTTTGA

501 CGTGCGCTTC GACTTCATCT CCTATATCGC GCTGACCGAA GCCAAAGAAC 

551 TGCACGCACT GCCGCGCCTT TCCAACCGCT GCCGCTACCA GATTGTCGGC 

601 TGCACCATGG ACGACCATTT CCAGATTGCC GAACCCATCC CGGGCATCCG 

651 CTATCAGGCA TTTATCGTGG GTATTCAGGC AGTCAGCCGC AACGGACTTG 

701 CCTCGCAGGA AGAACTCTCC GCATTCAACC GCCAGGTGGA CGCATTCGCA 

751 CAAAGCATGG GCGGTCAGAC GCTGCACACC GACCTTGCCG CCTTTATCGA 

801 AGTGGCTTCC GCACTGGACG CATTCTGCGC GCGCGTCGAC CAGACCATCG 

851 CCATCCATTT GGTTTCCCCG ACCAGCATCA GCGGCGTAGA ACTGCGTTCC 

901 GCCGTAACGG GCGTGGGTTT CGTTTTGGAA GACGACGGCG CGTTCCACTA 

951 TACCGACACG TCGGGCTCGA CCATGTTCTC CATCTGCTCG CTCAACAACG 

1001 AGCCGTTTAC CAACGCCCTT TTGGACAACC AGTCCTACAA AGGCTTCAGT 

1051 ATGCTGCTCG ACATCCCGCA CTCTCCGGCA GGCGAAAAAA CCTTCGACGA 

1101 TTTGTTTATG GATTTGGCGG TACGCCTGTC CGGCCAGTTG AACCTGAATC 

1151 TGGTCAACGA CAAAATGGAA GAAGTTTCGA CCCAATGGCT CAAAGACGTG 

1201 CGCACTTATG TATTGGCGCG TCAGTCCGAG ATGCTCAAAG TCGGTATCGA 

1251 ACCGGGCGGC AAAACCGCAT TGCGCCTGTT CTCCTAA
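
Listings in this format label each row with the position of its first base, so a transcription can be checked by confirming that every label equals the number of residues read so far plus one. The following is a minimal sketch of such a check; the function name `parse_numbered_rows` is hypothetical, and the sketch assumes each row is a start position followed by whitespace-separated blocks with no other annotation.

```python
def parse_numbered_rows(text):
    """Join rows of the form '<start> BLOCK BLOCK ...' into one string.

    Raises ValueError when a row label does not equal the running
    length plus one, which usually signals a dropped or duplicated block.
    """
    pieces = []
    length = 0
    for line in text.strip().splitlines():
        fields = line.split()
        if not fields:
            continue  # skip blank separator lines
        start, blocks = int(fields[0]), fields[1:]
        if start != length + 1:
            raise ValueError(
                f"row labelled {start} but {length} residues read so far")
        row = "".join(blocks)
        pieces.append(row)
        length += len(row)
    return "".join(pieces)


if __name__ == "__main__":
    # First two rows of the complete nucleotide sequence listed above.
    listing = """
    1 ATGATTTACA TCGTACTGTT TCTAGCTGTC GTCCTCGCCG TTGTCGCCTA

    51 CAACATGTAT CAGGAAAACC AATACCGCAA AAAAGTGCGC GACCAGTTCG
    """
    sequence = parse_numbered_rows(listing)
    print(len(sequence), sequence[:10])
```

The same check applies to the amino acid listings, since they use the same row-label convention.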



This corresponds to the amino acid sequence [<SEQ ID 526; ORF119-1>] (SEQ ID NO: 526;
ORF119-1):

1 MIYIVLFLAV VLAVVAYNMY QENQYRKKVR DQFGHSDKDA LLNSKTSHVR
51 DGKPSGGSVM MPKPQPAVKK TAKPQDPAMR NLQEQDAVYI AKQKQAKASP
101 FKTEIETALE ESGIIGNSAH TVSEPQTGHS APKPADAPAK PAPVPQTPAK
151 PLITLKELSK VELPWFDVRF DFISYIALTE AKELHALPRL SNRCRYQIVG
201 CTMDDHFQIA EPIPGIRYQA FIVGIQAVSR NGLASQEELS AFNRQVDAFA
251 QSMGGQTLHT DLAAFIEVAS ALDAFCARVD QTIAIHLVSP TSISGVELRS
301 AVTGVGFVLE DDGAFHYTDT SGSTMFSICS LNNEPFTNAL LDNQSYKGFS
351 MLLDIPHSPA GEKTFDDLFM DLAVRLSGQL NLNLVNDKME EVSTQWLKDV
401 RTYVLARQSE MLKVGIEPGG KTALRLFS*

Computer analysis of this amino acid sequence gave the following results: 
Homology with a predicted ORF from N. meningitidis (strain A) 

ORF119 (SEQ ID NO: 524) shows 93.7% identity over a 175aa overlap with an ORF (ORF119a)
(SEQ ID NO: 528) from strain A of N. meningitidis:

15 10 20 30 40 50 60 

or f 1 1 9 . pep MIYIVLFLAWLAWAYNMYQENQYRKKVRDQFGHSDKDALLNSXTSHVRDGKPSGGSVM 

lllllllhl III Ml llllllillllMIIMIMI IIMM llllllllllll II 

orf 119a MIYIVLFLAAVLAVVAYNMYQENQYRKKVRDQFGHSDKDALLNSKTSHVRDGKPSGGPVM 

10 20 30 40 50 60 

20 70 80 90 100 110 120 

orf 119. pep MPKPQPAVKKTAKPQDPXMRNLQEQDAVYIAKQKQAKASPFKTEIETALEESGIIGNSAH 

llllllllllll I III Mill 1 1 Mill II M III 1 1 1 II Ml 1 1 1 Ml I INN 

orf 119a MPKPQPAVKKTAKSQD PAMRNLQEQDAVY I AKQKQAKAS P FKTE I ETALEESG 1 1 GNS AH 

70 80 90 100 110 120 

25 130 140 150 160 170 

orf 119 .pep TVSEPQTGHSATKPADASAKPAPVPQTPAKPLITLKELSKVELSWFDVRIDFISY 

II Mil I II I Mill llhlllllllllllllllllllll II 1 1 hi II II 

orf 119a TVPEPQTGHSAPKPADAPAKPVPVPQTPAKPLITLKELSKVELPWFDVRFDFISYIALTE 

130 140 150 160 170 180 

30 orf 119a AKELHALPRLSNRCRYQIVGCTMDDHFQIAEPIPGIRYQAFIVGIQAVSRNGLASQEELS 

190 200 210 220 230 240 

The complete length ORF119a nucleotide sequence [<SEQ ID 527>] (SEQ ID NO: 527) is:

1 ATGATTTACA TCGTACTGTT CCTCGCCGCC GTCCTCGCCG TTGTCGCCTA 

51 CAATATGTAT CAGGAAAACC AATACCGCAA AAAAGTGCGC GACCAGTTCG

101 GGCACTCCGA CAAAGATGCC CTGCTCAACA GCAAAACCAG CCATGTCCGC 

151 GACGGCAAAC CGTCCGGCGG GCCAGTCATG ATGCCGAAAC CCCAACCGGC 

201 GGTCAAAAAA ACGGCAAAAT CCCAAGACCC CGCCATGCGC AACCTGCAAG 

251 AGCAGGATGC CGTCTACATC GCCAAGCAGA AACAGGCAAA AGCCTCCCCG 

301 TTCAAAACCG AAATCGAAAC CGCCTTGGAA GAAAGCGGCA TTATCGGCAA

351 CTCCGCCCAC ACCGTTCCCG AACCCCAAAC CGGACATTCC GCACCAAAAC 

401 CTGCCGACGC GCCGGCAAAA CCTGTTCCCG TTCCGCAAAC GCCGGCAAAA 

451 CCGCTGATTA CGCTCAAAGA GCTGTCGAAG GTCGAGCTGC CCTGGTTTGA 

501 CGTGCGCTTC GACTTCATCT CTTATATCGC GCTGACCGAA GCCAAAGAAC 

551 TGCACGCACT GCCGCGCCTT TCCAACCGCT GCCGCTACCA GATTGTCGGC

601 TGCACCATGG ACGACCATTT CCAGATTGCC GAACCCATCC CGGGCATCCG 

651 CTATCAGGCA TTTATCGTGG GTATTCAGGC AGTCAGCCGC AACGGACTTG 

701 CCTCGCAGGA AGAACTCTCC GCATTCAACC GCCAGGTGGA TGCATTCGCA 
751 CACAGCATGG GCGGTCAGAC GCTGCACACC GACCTTGCCG CCTTTATCGA

801 AGTGGCTTCC GCACTGGACG CATTCTGCGC GCGCGTCGAC CAGACTATCG 

851 CCATCCATTT GGTTTCCCCG ACCAGCATCA GCGGCGTAGA ACTGCGTTCC 

901 GCCGTAACGG GCGTGGGTTT CGTTTTGGAA GACGACGGCG CGTTCCACTA 

951 TACCGACACG TCGGGCTCGA CCATGTTCTC CATCTGCTCG CTCAACAACG

1001 AGCCGTTTAC CAATGCCCTT TTGGACAACC AGTCCTATAA AGGCTTCAGT 

1051 ATGCTGCTCG ACATCCCGCA CTCTCCGGCA GGCGAAAAAA CCTTCGACGA 

1101 TTTGTTTATG GATTTGGCGG TACGCCTGTC CGGCCAGTTG AACCTGAATC 

1151 TGGTCAACGA CAAAATGGAA GAAGTTTCGA CCCAATGGCT CAAAGACGTG 

1201 CGCACTTATG TATTGGCTCG TCAGTCCGAG ATGCTCAAAG TCGGTATCGA

1251 ACCGGGCGGC AAAACCGCAT TGCGCCTGTT CTCCTAA 

This encodes a protein having amino acid sequence [<SEQ ID 528>] (SEQ ID NO: 528):



1 MIYIVLFLAA VLAVVAYNMY QENQYRKKVR DQFGHSDKDA LLNSKTSHVR

51 DGKPSGGPVM MPKPQPAVKK TAKSQDPAMR NLQEQDAVYI AKQKQAKASP

101 FKTEIETALE ESGIIGNSAH TVPEPQTGHS APKPADAPAK PVPVPQTPAK 

151 PLITLKELSK VELPWFDVRF DFISYIALTE AKELHALPRL SNRCRYQIVG 

201 CTMDDHFQIA EPIPGIRYQA FIVGIQAVSR NGLASQEELS AFNRQVDAFA 

251 HSMGGQTLHT DLAAFIEVAS ALDAFCARVD QTIAIHLVSP TSISGVELRS 

301 AVTGVGFVLE DDGAFHYTDT SGSTMFSICS LNNEPFTNAL LDNQSYKGFS

351 MLLDIPHSPA GEKTFDDLFM DLAVRLSGQL NLNLVNDKME EVSTQWLKDV 

401 RTYVLARQSE MLKVGIEPGG KTALRLFS* 

ORF119a (SEQ ID NO: 528) and ORF119-1 (SEQ ID NO: 526) show 98.6% identity in 428 aa
overlap:



10 20 30 40 50 60 

orf 11 9a. pep MIYIVLFLAAVLAVVAYNMYQENQYRKKVRDQFGHSDKDALLNSKTSHVRDGKPSGGPVM 

MMMMMMMMMIMIMM IIMIIIIIIIIIMI MM MIIMI II 

orf 119-1 MIYIVLFLAWLAWAYNMYQENQYRKKVRDQFGHSDKDALLNSKTSHVRDGKPSGGSVM 
30 10 20 30 40 50 60 

70 80 90 100 110 120 

orf 119a. pep MPKPQPAVKKTAKSQDPAMRNLQEQDAVYIAKQKQAKASPFKTEIETALEESGI IGNSAH 

II II I II 1 1 IN I II 1 1 1 1 1 1 1 1 1 1 1 1 i II ,1 1 1 1 1 1 1 1 !l 1 1 1 1 1 1 1 1 1 1 i 1 1 1 M 

orf 119-1 MPKPQPAVKKTAKPQDPAMRNLQEQDAVYIAKQKQAKASPFKTEIETALEESGI IGNSAH 

35 70 80 90 100 110 120 

130 140 150 160 170 180 

orf 119a. pep TVPEPQTGHSAPKPADAPAKPVPVPQTPAKPLITLKELSKVELPWFDVRFDFISYIALTE 

II I Ml I Ml II 1 1 1 1 1 1 1 M 1 1 1 1 1 1 MM 1 1 1 1 1 1 1 Ml I II 1 1 1 1 1 1 1 M 1 1 1 1 1 

orf 119-1 TVSEPQTGHSAPKPADAPAKPAPVPQTPAKPLITLKELSKVELPWFDVRFDFISYIALTE 
40 130 140 150 160 170 180 

190 200 210 220 230 240 

orf 119a. pep AKELHALPRLSNRCRYQIVGCTMDDHFQIAEPIPGIRYQAFIVGIQAVSRNGLASQEELS 

III II II ;ll II III II III II III II II II II. II II Ml M III II II MM 

orf 119-1 AKELHALPRLSNRCRYQIVGCTMDDHFQIAEPIPGIRYQAFIVGIQAVSRNGLASQEELS 
45 190 200 210 220 230 240 

250 260 270 280 290 300 

orf 11 9a. pep AFNRQVDAFAHSMGGQTLHTDLAAFIEVASALDAFCARVDQTIAIHLVSPTSISGVELRS 

1 1 1 1 1 1 1 1 IM 1 1 M 1 1 1 1 1 1 1 1 1 1 1 1 1 1 M 1 1 1 1 M 1 1 1! 1 1 II 1 1 1 1 M I II II 1 1 

orf 119-1 AFNRQVDAFAQSMGGQTLHTDLAAF I EVAS ALDAFCARVDQT I A I HLVS PTS I SGVELRS 

50 . 250 260 270 280 290 300 
310 320 330 340 350 360

orf 119a . pep AVTGVGFVLEDDGAFHYTDTSGSTMFSICSLNNEPFTNALLDNQSYKGFSMLLDIPHSPA 

1 1 1 1 1 1 1 1 1 1 ! (III! Ill [ 1 1 1 1 1 1 1 1 1 1 1 ! 1 1 1 1 

orf 119-1 AVTGVGFVLEDDGAFHYTDTSGSTMFSICSLNNEPFTNALLDNQSYKGFSMLLDIPHSPA 

310 320 330 340 350 360 

370 380 . 390 400 410 420 

orf 119a . pep GEKTFDDLFMDLATOLSGQLNLNLVNDKMEEVSTQWLKDVRTYVLARQSEMLKVGIEPGG 

! 1 1 1 1 1 1 1 1 1 1 1 M 1 1 1 M 1 1 1 1 1 II I II 1 1 1 1 M II 1 1 1 1 II 1 1 1 1 1 1 1 1 1 1 1 1 1 1 i 1 1 

or f 11 9 - 1 GEKTFDDLFMDLAVRLSGQLNLNLVl^KMEEVSTQWLKDVRTYVLARQSEMLKVGIEPGG 

370 380 390 400 410 420 

429 

orf 119a. pep KTALRLFSX 

IMIIIIII 
orf 119-1 KTALRLFSX 

Homology with a predicted ORF from N. gonorrhoeae 

ORF119 (SEQ ID NO: 524) shows 93.1% identity over a 175aa overlap with a predicted ORF
(ORF119ng) (SEQ ID NO: 530) from N. gonorrhoeae:



orf 119 .pep MIYIVLFLAWLAWAYNMYQENQYRKKVRDQFGHSDKDALLNSXTSHVRDGKPSGGSVM 60 

1 1 1 1 1 1 1 1 M 1 1 1 1 1 1 1 1 1 1 1 1 1 II 1 1 ! M! 1 1 1 1 1 1 1 1 1 1 1 I Mill II II II II 

orf 119ng MIYIVLFLAAVLAWAYNMYQENQYRKKVRDQFGHSDKDALLNSKTSHVRDGKPSGGPVM 6 0 

orf 119. pep MPKPQP AVKKTAKPQDPXMRNLQEQDAVY I AKQKQAKAS P FKTE I ETALEESG 1 1 GNS AH 12 0 

IMIIIIII MM II 1 1 II 1 1 III 1 1 1 1 1 1 1 1 1 1 II 1 1 1 1 II M 1 1 I II II 1 1 1 

orf 119rig MPKPQPAVKKPAKPQDSAMRNLQEQDAVYIAKQKQAKASPFKTEIETALEEIGIIGNSAH 12 0 

orf 119. pep TVSEPQTGHSATKPADASAKPAPVPQTPAKPLITLKELSKVELSWFDVRIDFI SY 175 

IMIIIIIMI Mill II Ml 1 1 M 1 1 1 1 1 1 1 1 II I III M IMIIMMM 

orf 119ng TVSEPQTGHSAPKPADAPAKPVPVPQTPAKPLITLKELSKVELPWFDVRFDFISYIALTE 180 

The complete length ORF119ng nucleotide sequence [<SEQ ID 529>] (SEQ ID NO: 529) is:



1 ATGATTTACA TCGTACTGTT CCTCGCCGCC GTCCTCGCCG TTGTCGCCTA 

51 CAATATGTAT CAGGAAAACC AATACCGCAA AAAAGTGCGC GACCAGTTCG 

101 GACACTCCGA CAAAGATGCC CTGCTCAACA GCAAAACCAG CCATGTCCGC 

151 GACGGCAAAC CGTCCGGCGG GCCAGTCATG ATGCCGAAAC CCCAACCGGC 

201 GGTCAAAAAA CCGGCCAAAC CCCAAGACTC CGCCATGCGC AACCTGCAAG 

251 AACAGGATGC CGTCTACATC GCCAAGCAGA AACAGGCAAA AGCCTCCCCG

301 TTCAAAACCG AAATCGAAAC CGCCTTGGAA GAAATCGGCA TTATCGGCAA

351 CTCCGCCCAC ACCGTTTCCG AACCCCAAAC CGGACATTCC GCACCGAAAC

401 CTGCCGACGC GCCGGCAAAA CCCGTTCCCG TTCCGCAAAC GCCGGCAAAA
451 CCGCTGATTA CGCTCAAAGA GCTGTCGAAG GTCGAGCTGC CCTGGTTTGA
501 CGTGCGCTtC gACTTCATCT CCTATATCGC GCTGACCGAA GCCAAAGAAC 
551 TGCACGCACT GCCGCGCCTT tCCAACCGCT GCCGCTACCA GATTGTCGGC 
601 TGCACCATGG ACGACCATTT CCAGATTGCC GAACCCATCC CGGGCATCCG 
651 CTATCAGGCA TTTATCGTGG GTATCCAGGC AGTCAGCCGC AACGGACTTG 
701 CCTCGCAGGA AGAACTCTCC GCATTCAACC GCCAGGCGGA CGCATTCGCA 
751 CAAAGCATGG GCGGTCAGAC GCTGCACACC GACCTTGCCG CCTTTATCGA 
801 AGTGGCTTCC GCACTGGACG CATTCTGCGC GCGCGTCGAC CAGACCATCG 
851 CCATCCATTT GGTTTCGCCG ACCAGCATCA GCGGCGTAGA ACTGCGTTCC 
901 GCCGTAACGG GCGTGGGTTT CGTTTTGGAA GACGACGGCG CGTTCCACTA 
951 TACCGACACG TCGGGCTCGA CCATGTTCTC CATCTGCTCG CTCAACAACG 
1001 AGCCGTTTAC CAATGCCCTT TTGGACAACC AGTCCTACAA AGGCTTCAGT

1051 ATGCTGCTCG ACATCCCGCA CTCTCCGGCA GGCGAAAAAA CCTTCGACGA 

1101 TTTGTTTATG GATTTGGCGG TACGCCTGTC CGGTCAGTTG AACCTGAATC 

1151 TGGTCAACGA CAAAATGGAA GAAGTTTCGA CCCAATGGCT CAAAGACGTA 

1201 CGCACTTATG TATTGGCGCG TCAGTCCGAG ATGCTCAAAG TCGGTATCGA

1251 ACCGGGCGGC AAAACCGCCC TGCGCCTGTT TTCATAA

This encodes a protein having amino acid sequence [<SEQ ID 530>] (SEQ ID NO: 530):

1 MIYIVLFLAA VLAVVAYNMY QENQYRKKVR DQFGHSDKDA LLNSKTSHVR

51 DGKPSGGPVM MPKPQPAVKK PAKPQDSAMR NLQEQDAVYI AKQKQAKASP

101 FKTEIETALE EIGIIGNSAH TVSEPQTGHS APKPADAPAK PVPVPQTPAK 

151 PLITLKELSK VELPWFDVRF DFISYIALTE AKELHALPRL SNRCRYQIVG 

201 CTMDDHFQIA EPIPGIRYQA FIVGIQAVSR NGLASQEELS AFNRQADAFA 

251 QSMGGQTLHT DLAAFIEVAS ALDAFCARVD QTIAIHLVSP TSISGVELRS

301 AVTGVGFVLE DDGAFHYTDT SGSTMFSICS LNNEPFTNAL LDNQSYKGFS

351 MLLDIPHSPA GEKTFDDLFM DLAVRLSGQL NLNLVNDKME EVSTQWLKDV 

401 RTYVLARQSE MLKVGIEPGG KTALRLFS* 

ORF119ng (SEQ ID NO: 530) and ORF119-1 (SEQ ID NO: 526) show 98.4% identity over 428 aa
overlap:

10 20 30 40 50 60 

orf 119ng MIYIVLFLAAVLAVVAYNMYQENQYRKKVRDQFGHSDKDALLNSKTSHVRDGKPSGGPVM 

IIIIIIIMIIIIII IIIIIMIMIIMIII llllllll MIMIIIMII II 

orf 119-1 MIYIVLFLAWLAWAYNMYQENQYRKKVRDQFGHSDKDALLNSKTSHVRDGKPSGGSVM 
25 10 20 30 40 50 60 

70 80 90 100 110 120 

orf 119ng MPKPQPAVKKPAKPQDSAMRNLQEQDAVY I AKQKQAKAS P FKTE I ETALEE IG 1 1 GNS AH 

IIIIIMIM Mill M I II 1 1 1 1 1 1 1 1 1 1 1 1 i II 1 1 1 1 1 M 1 1 1 1 M II M 1 1 II 

orf 119-1 MP KPQP AVKKTAKPQDPAMRNLQEQDAVY I AKQKQAKAS P FKTE I ETALEESG 1 1 GNS AH 

30 70 80 90 100 110 120 

130 140 150 160 170 180 

orf 119ng TVS E PQTGHS APKPADAPAKP VPVPQTP AKPL I TLKELS KVELPWFDVRFDF I S Y I ALTE 

I I I I I I I I I I I I I I I I ! I I ! I : I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 
orf 119 - 1 TVSEPQTGHS APKPADAPAKPAPVPQTPAKPLITLKELSKVELPWFDVRFDFISYI ALTE 

35 130 140 150 160 170 180 

190 200 210 220 230 240 

orf 119ng AKELHALPRLSNRCRYQIVGCTMDDHFQIAEPIPGIRYQAFIVGIQAVSRNGLASQEELS 

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 
orf 119-1 AKELHALPRLSNRCRYQIVGCTMDDHFQIAEPIPGIRYQAFIVGIQAVSRNGLASQEELS 
40 190 200 210 220 230 240 

250 260 270 280 290 300 

orf 1 1 9ng AFNRQADAFAQSMGGQTLHTDLAAF I EVAS ALDAFCARVDQT I AI HLVS PTS ISGVELRS 

MIIMI MINIM IIIIIMIMIIMIII IIIIMIIIMIIIIIIIIIIIII . 

orf 119-1 AFNRQVDAFAQSMGGQTLHTDLAAF I EVAS ALDAFCARVDQT I AIHLVS PTS I SGVELRS 

45 250 260 270 280 290 300 

310 320 330 340 350 360 

orf 119ng AVTGVGFVLEDDGAFHYTDTSGSTMFSICSLNNEPFTNALLDNQSYKGFSMLLDIPHSPA 

IIIMIIIII MM IIIMIIMIIMIMII IMMII lllllll III! IMMIMI 

orf 119-1 AVTGVGFVLEDDGAFHYTDTSGSTMFSICSLNNEPFTNALLDNQSYKGFSMLLDIPHSPA 
50 310 320 330 340 350 360 
                   370       380       390       400       410       420
orf119ng    GEKTFDDLFMDLAVRLSGQLNLNLVNDKMEEVSTQWLKDVRTYVLARQSEMLKVGIEPGG
            ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
orf119-1    GEKTFDDLFMDLAVRLSGQLNLNLVNDKMEEVSTQWLKDVRTYVLARQSEMLKVGIEPGG
                   370       380       390       400       410       420

                  429
orf119ng    KTALRLFSX
            |||||||||
orf119-1    KTALRLFSX

Based on this analysis, including the presence of a putative leader sequence in the gonococcal 
protein, it is predicted that the proteins from N. meningitidis and N. gonorrhoeae, and their epitopes, 
could be useful antigens for vaccines or diagnostics, or for raising antibodies. 



Example 64

The following partial DNA sequence was identified in N. meningitidis [<SEQ ID 531>] (SEQ ID
NO: 531):

1 . . GCGCGGCACG GCACGGAAGA TTTCTTCATG AACAACAGCG ACAC.ATCAG
51 GCAGATAGTC GAAAGCACCA CCGGTACGAT GAAGCTGCTG ATTTCCTCCA
101 TCGCCCTGAT TTCATTGGTA GTCGGCGGCA TCGGCGTGAT GAACATCATG
151 CTGGTGTCCG TTACCGAGCG CACGAAAGAA ATCGGCATAC GGATGGCAAT
201 CGGCGCGCGG CGCGGCAATA TTTyGCAGCA GTTTTTGATT GAGGCGGTGT
251 TAATCTGCGT CATCGGCGGT TTGGTCGGCG TGGGTTTGTC CGCCGCCGTC
301 AGCCTCGTGT TCAATCATTT TGTAACCGAC TTCCCGATGG ACATTTCCGC
351 CATGTCCGTC ATCGGCGCGG TCGCCTGTTC GACCGGAATC GGCATCGCGT
401 TCGGCTTTAT GCCTGCCAAT AAAGCAGCCA AACTCAATCC GATAGACGCA
451 TTGGCACAGG ATTGA

This corresponds to the amino acid sequence [<SEQ ID 532; ORF134>] (SEQ ID NO: 532;
ORF134):

1 . . ARHGTEDFFM NNSDXIRQIV ESTTGTMKLL ISSIALISLV VGGIGVMNIM
51 LVSVTERTKE IGIRMAIGAR RGNIXQQFLI EAVLICVIGG LVGVGLSAAV
101 SLVFNHFVTD FPMDISAMSV IGAVACSTGI GIAFGFMPAN KAAKLNPIDA
151 LAQD*

Further work revealed the complete nucleotide sequence [<SEQ ID 533>] (SEQ ID NO: 533):

1 ATGTCGGTGC AAGCAGTATT GGCGCACAAA ATGCGTTCGC TTCTGACGAT
51 GCTCGGCATC ATCATCGGTA TCGCGTCGGT GGTTTCCGTC GTCGCATTGG
101 GCAATGGTTC GCAGAAAAAA ATCCTTGAAG ACATCAGTTC GATAGGGACG
151 AACACCATCA GCATCTTCCC GGGGCGCGGC TTCGGCGACA GGCGCAGCGG
201 CAGGATTAAA ACCCTGACCA TAGACGACGC AAAAATCATC GCCAAACAAA
251 GCTACGTTGC TTCCGCCACG CCCATGACTT CGAGCGGCGG CACGCTGACT
301 TACCGCAACA CCGACCTGAC CGCCTCGCTT TACGGCGTGG GCGAACAATA
351 TTTCGACGTG CGCGGACTGA AGCTGGAAAC GGGGCGGCTG TTTGACGAAA
401 ACGATGTGAA AGAAGACGCG CAGGTCGTCG TCATCGACCA AAATGTCAAA
451 GACAAACTCT TTGCGGACTC GGATCCGTTG GGTAAAACCA TTTTGTTCAG
501 GAAACGCCCC TTGACCGTCA TCGGCGTGAT GAAAAAAGAC GAAAACGCTT
551 TCGGCAATTC CGACGTGCTG ATGCTTTGGT CGCCCTATAC GACGGTGATG
601 CACCAAATCA CAGGCGAGAG CCACACCAAC TCCATCACCG TCAAAATCAA
651 AGACAATGCC AATACCCAGG TTGCCGAAAA AGGGCTGACC GATCTGCTCA
701 AAGCGCGGCA CGGCACGGAA GATTTCTTCA TGAACAACAG CGACAGCATC
751 AGGCAGATAG TCGAAAGCAC CACCGGTACG ATGAAGCTGC TGATTTCCTC
801 CATCGCCCTG ATTTCATTGG TAGTCGGCGG CATCGGCGTG ATGAACATCA
851 TGCTGGTGTC CGTTACCGAG CGCACCAAAG AAATCGGCAT ACGGATGGCA
901 ATCGGCGCGC GGCGCGGCAA TATTTTGCAG CAGTTTTTGA TTGAGGCGGT
951 GTTAATCTGC GTGATCGGCG GTTTGGTCGG CGTGGGTTTG TCCGCCGCCG
1001 TCAGCCTCGT GTTCAATCAT TTTGTAACCG ACTTCCCGAT GGACATTTCC
1051 GCCATGTCCG TCATCGGCGC GGTCGCCTGT TCGACCGGAA TCGGCATCGC
1101 GTTCGGCTTT ATGCCTGCCA ATAAAGCAGC CAAACTCAAT CCGATAGACG
1151 CATTGGCACA GGATTGA

This corresponds to the amino acid sequence [<SEQ ID 534; ORF134-1>] (SEQ ID NO: 534;
ORF134-1):

1 MSVQAVLAHK MRSLLTMLGI IIGIASVVSV VALGNGSQKK ILEDISSIGT
51 NTISIFPGRG FGDRRSGRIK TLTIDDAKII AKQSYVASAT PMTSSGGTLT
101 YRNTDLTASL YGVGEQYFDV RGLKLETGRL FDENDVKEDA QVVVIDQNVK
151 DKLFADSDPL GKTILFRKRP LTVIGVMKKD ENAFGNSDVL MLWSPYTTVM
201 HQITGESHTN SITVKIKDNA NTQVAEKGLT DLLKARHGTE DFFMNNSDSI
251 RQIVESTTGT MKLLISSIAL ISLVVGGIGV MNIMLVSVTE RTKEIGIRMA
301 IGARRGNILQ QFLIEAVLIC VIGGLVGVGL SAAVSLVFNH FVTDFPMDIS
351 AMSVIGAVAC STGIGIAFGF MPANKAAKLN PIDALAQD*

Computer analysis of this amino acid sequence gave the following results:

Homology with the hypothetical protein o648 (SEQ ID NO: 1144) of E. coli (accession number
AE000189)

ORF134 (SEQ ID NO: 532) and o648 protein (SEQ ID NO: 1144) show 45% aa identity in 153aa
overlap:

Orf134:   2 RHGTEDFFMNNSDXIRQIVESTTGTMKXXXXXXXXXXXVVGGIGVMNIMLVSVTERTKEI 61
            RHG +DFF  N D + + VE TT T++           VVGGIGVMNIMLVSVTERT+EI
o648:   496 RHGKKDFFTWNMDGVLKTVEKTTRTLQLFLTLVAVISLVVGGIGVMNIMLVSVTERTREI 555

Orf134:  62 GIRMAIGARRGNIXQQFLIEAXXXXXXXXXXXXXXXXXXXXXFNHFVTDFPMDISAMSVI 121
            GIRMA+GAR  ++ QQFLIEA                        F+  + +  S ++++
o648:   556 GIRMAVGARASDVLQQFLIEAVLVCLVGGALGITLSLLIAFTLQLFLPGWEIGFSPLALL 615

Orf134: 122 GAVACSTGIGIAFGFMPANKAAKLNPIDALAQD 154
             A  CST  GI FG++PA  AA+L+P+DALA++
o648:   616 LAFLCSTVTGILFGWLPARNAARLDPVDALARE 648

Homology with a predicted ORF from N. meningitidis (strain A)
ORF134 (SEQ ID NO: 532) shows 98.7% identity over a 154aa overlap with an ORF
(SEQ ID NO: 536) from strain A of N. meningitidis:

                                                  10        20        30
orf134.pep                                ARHGTEDFFMNNSDXIRQIVESTTGTMKLL
                                          |||||||||||||| |||||||||||||||
orf134a     GESHTNSITVKIKDNANTQVAEKGLTDLLKARHGTEDFFMNNSDSIRQIVESTTGTMKLL
                   210       220       230       240       250       260

                    40        50        60        70        80        90
orf134.pep  ISSIALISLVVGGIGVMNIMLVSVTERTKEIGIRMAIGARRGNIXQQFLIEAVLICVIGG
            |||||||||||||||||||||||||||||||||||||||||||| |||||||||||||||
orf134a     ISSIALISLVVGGIGVMNIMLVSVTERTKEIGIRMAIGARRGNILQQFLIEAVLICVIGG
                   270       280       290       300       310       320

                   100       110       120       130       140       150
orf134.pep  LVGVGLSAAVSLVFNHFVTDFPMDISAMSVIGAVACSTGIGIAFGFMPANKAAKLNPIDA
            ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
orf134a     LVGVGLSAAVSLVFNHFVTDFPMDISAMSVIGAVACSTGIGIAFGFMPANKAAKLNPIDA
                   330       340       350       360       370       380

orf134.pep  LAQDX
            |||||
orf134a     LAQDX
The complete length ORF134a nucleotide sequence [<SEQ ID 535>] (SEQ ID NO: 535) is:

1 ATGTCGGTGC AAGCAGTATT GGCGCACAAA ATGCGTTCGC TTCTGACGAT
51 GCTCGGCATC ATCATCGGTA TCGCTTCGGT TGTCTCCGTC GTCGCATTGG
101 GCAACGGTTC GCAGAAAAAA ATCCTTGAAG ACATCAGTTC GATAGGGACG
151 AACACCATCA GCATCTTCCC AGGGCGCGGC TTCGGCGACA GGCGCAGCGG
201 CAGGATTAAA ACCCTGACCA TAGACGACGC AAAAATCATC GCCAAACAAA
251 GCTACGTTGC TTCCGCCACG CCCATGACTT CGAGCGGCGG CACGCTGACT
301 TACCGCAATA CCGACCTGAC CGCTTCTTTG TACGGTGTGG GCGAACAATA
351 TTTCGACGTG CGCGGGCTGA AGCTGGAAAC GGGGCGGCTG TTTGACGAAA
401 ACGATGTGAA AGAAGACGCG CAGGTCGTCG TCATCGACCA AAATGTCAAA
451 GACAAACTCT TTGCGGACTC GGATCCGTTG GGTAAAACCA TTTTGTTCAG
501 GAAACGCCCC TTGACCGTCA TCGGCGTGAT GAAAAAAGAC GAAAACGCTT
551 TCGGCAATTC CGACGTGCTG ATGCTTTGGT CGCCCTATAC GACGGTGATG
601 CACCAAATCA CAGGCGAGAG CCACACCAAC TCCATCACCG TCAAAATCAA
651 AGACAATGCC AATACCCAGG TTGCCGAAAA AGGGCTGACC GATCTGCTCA
701 AAGCGCGGCA CGGCACGGAA GATTTCTTCA TGAACAACAG CGACAGCATC
751 AGGCAGATAG TCGAAAGCAC CACCGGTACG ATGAAGCTGC TGATTTCCTC
801 CATCGCCCTG ATTTCATTGG TAGTCGGCGG CATCGGCGTG ATGAACATCA
851 TGCTGGTGTC CGTTACCGAG CGCACCAAAG AAATCGGCAT ACGGATGGCA
901 ATCGGCGCGC GGCGCGGCAA TATTTTGCAG CAGTTTTTGA TTGAGGCGGT
951 GTTAATCTGC GTCATCGGCG GTTTGGTCGG CGTGGGTTTG TCCGCCGCCG
1001 TCAGCCTCGT GTTCAATCAT TTTGTAACCG ACTTCCCGAT GGACATTTCC
1051 GCCATGTCCG TCATCGGCGC GGTCGCCTGT TCGACCGGAA TCGGCATCGC
1101 GTTCGGCTTT ATGCCTGCCA ATAAAGCAGC CAAACTCAAT CCGATAGATG
1151 CATTGGCGCA GGATTGA

This encodes a protein having amino acid sequence [<SEQ ID 536>] (SEQ ID NO: 536):



1 MSVQAVLAHK MRSLLTMLGI IIGIASVVSV VALGNGSQKK ILEDISSIGT
51 NTISIFPGRG FGDRRSGRIK TLTIDDAKII AKQSYVASAT PMTSSGGTLT
101 YRNTDLTASL YGVGEQYFDV RGLKLETGRL FDENDVKEDA QVVVIDQNVK
151 DKLFADSDPL GKTILFRKRP LTVIGVMKKD ENAFGNSDVL MLWSPYTTVM
201 HQITGESHTN SITVKIKDNA NTQVAEKGLT DLLKARHGTE DFFMNNSDSI
251 RQIVESTTGT MKLLISSIAL ISLVVGGIGV MNIMLVSVTE RTKEIGIRMA
301 IGARRGNILQ QFLIEAVLIC VIGGLVGVGL SAAVSLVFNH FVTDFPMDIS
351 AMSVIGAVAC STGIGIAFGF MPANKAAKLN PIDALAQD*



ORF134a (SEQ ID NO: 536) and ORF134-1 (SEQ ID NO: 534) show 100.0% identity in 388 aa
overlap:

orf134a.pep  MSVQAVLAHKMRSLLTMLGIIIGIASVVSVVALGNGSQKKILEDISSIGTNTISIFPGRG
             ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
orf134-1     MSVQAVLAHKMRSLLTMLGIIIGIASVVSVVALGNGSQKKILEDISSIGTNTISIFPGRG

orf134a.pep  FGDRRSGRIKTLTIDDAKIIAKQSYVASATPMTSSGGTLTYRNTDLTASLYGVGEQYFDV
             ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
orf134-1     FGDRRSGRIKTLTIDDAKIIAKQSYVASATPMTSSGGTLTYRNTDLTASLYGVGEQYFDV

orf134a.pep  RGLKLETGRLFDENDVKEDAQVVVIDQNVKDKLFADSDPLGKTILFRKRPLTVIGVMKKD
             ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
orf134-1     RGLKLETGRLFDENDVKEDAQVVVIDQNVKDKLFADSDPLGKTILFRKRPLTVIGVMKKD

orf134a.pep  ENAFGNSDVLMLWSPYTTVMHQITGESHTNSITVKIKDNANTQVAEKGLTDLLKARHGTE
             ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
orf134-1     ENAFGNSDVLMLWSPYTTVMHQITGESHTNSITVKIKDNANTQVAEKGLTDLLKARHGTE

orf134a.pep  DFFMNNSDSIRQIVESTTGTMKLLISSIALISLVVGGIGVMNIMLVSVTERTKEIGIRMA
             ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
orf134-1     DFFMNNSDSIRQIVESTTGTMKLLISSIALISLVVGGIGVMNIMLVSVTERTKEIGIRMA

orf134a.pep  IGARRGNILQQFLIEAVLICVIGGLVGVGLSAAVSLVFNHFVTDFPMDISAMSVIGAVAC
             ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
orf134-1     IGARRGNILQQFLIEAVLICVIGGLVGVGLSAAVSLVFNHFVTDFPMDISAMSVIGAVAC

orf134a.pep  STGIGIAFGFMPANKAAKLNPIDALAQDX
             |||||||||||||||||||||||||||||
orf134-1     STGIGIAFGFMPANKAAKLNPIDALAQDX

Homology with a predicted ORF from N. gonorrhoeae 

ORF134 (SEQ ID NO: 532) shows 96.8% identity over a 154aa overlap with a predicted ORF
(ORF134.ng) (SEQ ID NO: 538) from N. gonorrhoeae:



orf134.pep ARHGTEDFFMNNSDXIRQIVESTTGTMKLL 30
llllllllllllll MMIIIMM M
orf134ng GESHTNSITVKIKDNANTRVAEKGLAELLKARHGTEDFFMNNSDSIRQMVESTTGTMKLL 264

orf134.pep ISSIALISLVVGGIGVMNIMLVSVTERTKEIGIRMAIGARRGNIXQQFLIEAVLICVIGG 90
1 1 1 ! 1 1 II I i 1 1 1 M 1 1 1 1 1 M 1 1 1 i I < I MM III IIIIIIIMIhlll
orf134ng ISSIALISLVVGGIGVMNIMLVSVTERTKEIGIRMAIGARRGNILQQFLIEAVLICIIGG 324

orf134.pep LVGVGLSAAVSLVFNHFVTDFPMDISAMSVIGAVACSTGIGIAFGFMPANKAAKLNPIDA 150
1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 M 1 1 II 1 1 II 1 1 1 M M 1 1 1 1 1 1 1 1 1 1 1
orf134ng LVGVGLSAAVSLVFNHFVTDFPMDISAASVIGAVACSTGIGIAFGFMPANKAAKLNPIDA 384

orf134.pep LAQD 154
           ||||
orf134ng   LAQD 388

The complete length ORF134ng nucleotide sequence [<SEQ ID 537>] (SEQ ID NO: 537) is:

1 ATGTCGGTGC AAGCAGTATT GGCGCACAAA ATGCGTTCGC TTCTGACCAT 

51 GCTCGGCATC ATCATCGGTA TCGCTTCGGT TGTCTCCGTC GTCGCGCTGG

101 GCAACGGTTC GCAGAAAAAA ATCCTCGAAG ACATCAGTTC GATGGGGACG 

151 AACACCATCA GCATCTTCCC CGGGCGCGGC TTCGGCGACA GGCGCAGCGG 

201 CAAAATCAAA ACCCTGACCA TAGACGACGC AAAAATCATC GCCAAACAAA 

251 GCTACGTTGC CTCCGCCACG CCCATGACTT CGAGCGGCGG CACGCTGACC 

301 TACCGCAATA CCGACCTGAC CGCTTCTTTG TACGGTGTGG GCGAACAATA

351 TTTCGACGTG CGCGGGCTGA AGCTGGAAAC GGGGCGGCTG TTTGATGAGA 

401 ACGATGTGAA AGAAGACGCG CAAGTCGTCG TCATCGACCA AAATGTCAAA

451 GACAAACTCT TTGCGGACTC GGATCCGTTG GGTAAAACCA TTTTGTTCAG

501 GAAACGCCCC TTGACCGTCA TCGGCGTGAT GAAAAAAGAC GAAAACGCTT 

551 TCGGCAATTC CGACGTGCTG ATGCTTTGGT CGCCCTATAC GACGGTGATG

601 CACCAAATCA CAGGCGAGAG CCACACCAAC TCCATCACCG TCAAAATCAA 

651 AGACAATGCC AATACCCGGG TTGCCGAAAA AGGGCTGGCC GAGCTGCTCA 

701 AAGCACGGCA CGGCACGGAA GACTTCTTTA TGAACAACAG CGACAGCATC 

751 AGGCAGATGG TCGAAAGCAC CACCGGTACG ATGAAGCTGC TGATTTCCTC 

801 CATCGCCCTG ATTTCATTGG TAGTCGGCGG CATCGGTGTG ATGAACATTA

851 TGCTGGTGTC CGTTACCGAG CGCACCAAAG AAATCGGCAT ACGGATGGCA 

901 ATCGGCGCGC GGCGCGGCAA TATTTTGCAG CAGTTTTTGA TTGAGGCGGT 

951 GTTAATCTGC ATCATCGGAG GCTTGGTCGG CGTAGGTTTG TCCGCCGCCG 

1001 TCAGCCTCGT GTTCAATCAT TTTGTAACCG ATTTCCCGAT GGACATTTCG 

1051 GCGGCATCCG TTATCGGGGC GGTCGCCTGT TCGACCGGAA TCGGCATCGC

1101 GTTCGGCTTT ATGCCTGCCA ATAAGGCAGC CAAACTCAAT CCGATAGATG 

1151 CATTGGCGCA GGATTGA 

This encodes a protein having amino acid sequence [<SEQ ID 538>] (SEQ ID NO: 538):

1 MSVQAVLAHK MRSLLTMLGI IIGIASVVSV VALGNGSQKK ILEDISSMGT
51 NTISIFPGRG FGDRRSGKIK TLTIDDAKII AKQSYVASAT PMTSSGGTLT
101 YRNTDLTASL YGVGEQYFDV RGLKLETGRL FDENDVKEDA QVVVIDQNVK
151 DKLFADSDPL GKTILFRKRP LTVIGVMKKD ENAFGNSDVL MLWSPYTTVM
201 HQITGESHTN SITVKIKDNA NTRVAEKGLA ELLKARHGTE DFFMNNSDSI
251 RQMVESTTGT MKLLISSIAL ISLVVGGIGV MNIMLVSVTE RTKEIGIRMA
301 IGARRGNILQ QFLIEAVLIC IIGGLVGVGL SAAVSLVFNH FVTDFPMDIS
351 AASVIGAVAC STGIGIAFGF MPANKAAKLN PIDALAQD*
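
The relationship between a nucleotide listing and the protein it encodes can be checked with a short translation routine. The Python sketch below assumes the standard genetic code and reading frame 1. The names `CODON_TABLE` and `translate` are illustrative and do not come from the original analysis. It translates the first thirty bases of the ORF134ng sequence and prints the first ten residues of the protein.

```python
# Standard genetic code, built from the conventional TCAG ordering.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def translate(dna: str) -> str:
    """Translate DNA in frame 1; whitespace is ignored.

    Codons containing anything other than A, C, G or T are reported as 'X',
    and a trailing partial codon is dropped.
    """
    dna = "".join(dna.split()).upper()
    usable = len(dna) - len(dna) % 3
    codons = (dna[i:i + 3] for i in range(0, usable, 3))
    return "".join(CODON_TABLE.get(codon, "X") for codon in codons)


# First three 10-base blocks of the ORF134ng nucleotide sequence.
print(translate("ATGTCGGTGC AAGCAGTATT GGCGCACAAA"))  # MSVQAVLAHK
```

Because unknown codons map to X, the same routine also handles listings that contain ambiguity codes such as N.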

ORF134ng (SEQ ID NO: 538) and ORF134-1 (SEQ ID NO: 534) show 97.9% identity in 388 aa
overlap:

orf134ng MSVQAVLAHKMRSLLTMLGIIIGIASVVSVVALGNGSQKKILEDISSMGTNTISIFPGRG
1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 II 1 1 1 1 1 1 1 1 1 1 1 II 1 1 1 1 1 1 1 M 1 1 1 1 : 1 1 M
orf134-1 MSVQAVLAHKMRSLLTMLGIIIGIASVVSVVALGNGSQKKILEDISSIGTNTISIFPGRG

orf134ng FGDRRSGKIKTLTIDDAKIIAKQSYVASATPMTSSGGTLTYRNTDLTASLYGVGEQYFDV
I I I I I I I : I I I I I I I I I I I I I I I I I I I I I I I I I I I I II I I I I I I I I I I I I I I I I I I I M I
orf134-1 FGDRRSGRIKTLTIDDAKIIAKQSYVASATPMTSSGGTLTYRNTDLTASLYGVGEQYFDV

orf134ng RGLKLETGRLFDENDVKEDAQVVVIDQNVKDKLFADSDPLGKTILFRKRPLTVIGVMKKD
III llllllll Mill II I II II Ml II 1 1 llllllll I II II II II Ml II II MM II
orf134-1 RGLKLETGRLFDENDVKEDAQVVVIDQNVKDKLFADSDPLGKTILFRKRPLTVIGVMKKD

orf134ng ENAFGNSDVLMLWSPYTTVMHQITGESHTNSITVKIKDNANTRVAEKGLAELLKARHGTE
I I I I I I i I II' II I I M I I I I II II I I II I I I I I I I . I I I I hi I I I I I- I I I I I I I M
orf134-1 ENAFGNSDVLMLWSPYTTVMHQITGESHTNSITVKIKDNANTQVAEKGLTDLLKARHGTE

orf134ng DFFMNNSDSIRQMVESTTGTMKLLISSIALISLVVGGIGVMNIMLVSVTERTKEIGIRMA
I I I I I I I II I I I ' I I I I I I I I I I II I I I I I I I I I I II I I II I I I I II I I I I I I I II I I II
orf134-1 DFFMNNSDSIRQIVESTTGTMKLLISSIALISLVVGGIGVMNIMLVSVTERTKEIGIRMA

orf134ng IGARRGNILQQFLIEAVLICIIGGLVGVGLSAAVSLVFNHFVTDFPMDISAASVIGAVAC
I I I I I I I I I I I I I I I I I I Ml ! I I I I I I I I I I I M I I M I I I I I I I llllllll
orf134-1 IGARRGNILQQFLIEAVLICVIGGLVGVGLSAAVSLVFNHFVTDFPMDISAMSVIGAVAC

orf134ng STGIGIAFGFMPANKAAKLNPIDALAQDX
I I I I I I I I I I I I I I II I I I I I I I I I I I I
orf134-1 STGIGIAFGFMPANKAAKLNPIDALAQDX

ORF134ng (SEQ ID NO: 538) also shows homology to an E. coli ABC transporter (SEQ ID NO:
1145):






sp|P75831|YBJZ_ECOLI HYPOTHETICAL ABC TRANSPORTER ATP-BINDING PROTEIN YBJZ )gi5
(AE000189) o648; similar to YBBA_HAEIN SW: P45247 [Escherichia coli] Length = 648 
Score = 297 bits (753), Expect = 6e-80 

Identities = 162/389 (41%), Positives = 230/389 (58%), Gaps = 1/389 (0%) 






Query: 1 MSVQAVLAHKMRSLLTMLXXXXXXXXXXXXXXLGNGSQKKILEDISSMGTNTISIFPGRG 60 

M+ +A+ A+KMR+LLTML +G+ +++ +L DI S+GTNTI ++PG+ 

Sbjct: 260 MAWRALAANKMRTLLTMLGIIIGIASWSIVWGDAAKQMVLADIRSIGTNTIDVYPGKD 319



Query : 61 FGDRRSGKIKTLTIDDAKIIAKQSYVASATPMTSSGGTLTYRNTDLTASLYGVGEQYFDV 120 

FGD + L DD I KQ +VASATP S L Y N D+ AS GV YF+V 

Sbjct: 320 FGDDDPQYQQALKYDDLIAIQKQPWVASATPAVSQNLRLRYNNVDVAASANGVSGDYFNV 379 






Query: 121 RGLKLETGRLFDENDVKEDAQVWIDQNVKDKLFAD-SDPLGKTILFRKRPLTVIGVMKK 179 

G+ G F++ + AQVW+D N + +LF +D +G+ IL P VIGV + + 
Sbjct: 380 YGMTFSEGNTFNQEQLNGRAQVWLDSNTRRQLFPHKADWGEVILVGNMPARVIGVAEE 439 

Query: 180 DENAFGNSDVLMLWSPYTTVMHQITGESHTNSITVKIKDNANTRVAEKGLAELLKARHGT 239

++ FG+S VL +W PY+T+ ++ G+S NSITV++K+ ++ AE+ L LL RHG
Sbjct: 440 KQSMFGSSKVLRVWLPYSTMSGRVMGQSWLNSITVRVKEGFDSAEAEQQLTRLLSLRHGK 499

Query: 240 EDFFMNNSDSIRQMVESTTGTMKXXXXXXXXXXXWGGIGVMNIMLVSVTERTKEIGIRM 299 

+DFF N D + + VE TT T++ WGGIGVMNIMLVSVTERT+EIGIRM 
Sbjct: 500 KDFFTWNMDGVLKTVEKTTRTLQLFLTLVAVISLWGGIGVMNIMLVSVTERTREIGIRM 559 

Query: 300 AIGARRGNILQQFLIEXXXXXXXXXXXXXXXXXXXXXXFNHFVTDFPMDISAASVIGAVA 359 

A+GAR ++LQQFLIE F+ + + S +++ A 

Sbjct: 560 AVGARASDVLQQFLIEAVLVCLVGGALGITLSLLIAFTLQLFLPGWEIGFSPLALLLAFL 619



Query: 360 CSTGIGIAFGFMPANKAAKLNPIDALAQD 388

CST GI FG++PA AA+L+P+DALA++
Sbjct: 620 CSTVTGILFGWLPARNAARLDPVDALARE 648

Based on this analysis, including the presence of the leader peptide and transmembrane regions in
the gonococcal protein, it is predicted that these proteins from N. meningitidis and N. gonorrhoeae,
and their epitopes, could be useful antigens for vaccines or diagnostics, or for raising antibodies.

Example 65 

The following partial DNA sequence was identified in N. meningitidis [<SEQ ID 539>] (SEQ ID
NO: 539):

1 ..GGGACGGGAG CGATGCTGCT GCTGTTTTAC GCGGTAACGA T.CTGCCTTT
51 GGCCACTGGC GTTACCCTGA GTTACACCTC GTCGATTTTT TTGGCGGTAT
101 TTTCCTTCCT GATTTTGAAA GAACGGATTT CCGTTTACAC GCAGGCGGTG
151 CTGCTCCTTG GTTTTGCCGG CGTGGTATTG CTGCTTAATC CCTCGTTCCG
201 CAGCGGTCAG GAAACGGCGG CACTCGCCGG GCTGGCGGGC GGCGCGATGT
251 CCGGCTGGGC GTATTTGAAA GTGCGCGAAC TGTCTTTGGC GGGCGAACCC
301 GGCTGGCGCG TCGTGTTTTA CCTTTCCGTG ACAGGTGTGG CGATGTCGTC
351 GGTTTGGGCG ACGCTGACCG GCTGGCACAC CCTGTCCTTT CCATCGGCAG
401 TTTATCTGTC GTGCATCGGC GTGTCCGCGC TGATTGCCCA ACTGTCGATG
451 ACGCGCGCCT ACAAAGTCGG CGACAAATTC ACGGTTGCCT CGCTTTCCTA
501 TATGACCGTC GTTTTTTCCG CTCTGTCTGC CGCATTTTTT CTGGGCGAAG
551 AGCTTTTCTG GCAGGAAATA CTCGGTATGT GCATCATCAT CCTCAGCGGT
601 ATTTTGA

This corresponds to the amino acid sequence [<SEQ ID 540; ORF135>] (SEQ ID NO: 540;
ORF135):

1 ..GTGAMLLLFY AVTILPLATG VTLSYTSSIF LAVFSFLILK ERISVYTQAV
51 LLLGFAGVVL LLNPSFRSGQ ETAALAGLAG GAMSGWAYLK VRELSLAGEP
101 GWRVVFYLSV TGVAMSSVWA TLTGWHTLSF PSAVYLSCIG VSALIAQLSM
151 TRAYKVGDKF TVASLSYMTV VFSALSAAFF LGEELFWQEI LGMCIIISAV
201 F*

Further work revealed the complete nucleotide sequence [<SEQ ID 541>] (SEQ ID NO: 541):



1 ATGGATACCG CAAAAAAAGA CATTTTAGGA TCGGGCTGGA TGCTGGTGGC 

51 GGCGGCCTGC TTTACCATTA TGAACGTATT GATTAAAGAG GCATCGGCAA 

101 AATTTGCCCT CGGCAGCGGC GAATTGGTCT TTTGGCGCAT GCTGTTTTCA 

151 ACCGTTGCGC TCGGGGCTGC CGCCGTATTG CGTCGGGACA mCTTCCGCAC 

201 GCCCCATTGG AAAAACCACT TAAACCGCAG TATGGTCGGG ACGGGGGCGA

251 TGCTGCTGCT GTTTTACGCG GTAACGCATC TGCCTTTGGC CACTGGCGTT

301 ACCCTGAGTT ACACCTCGTC GATTTTTTTG GCGGTATTTT CCTTCCTGAT
351 TTTGAAAGAA CGGATTTCCG TTTACACGCA GGCGGTGCTG CTCCTTGGTT 
401 TTGCCGGCGT GGTATTGCTG CTTAATCCCT CGTTCCGCAG CGGTCAGGAA 

451 ACGGCGGCAC TCGCCGGGCT GGCGGGCGGC GCGATGTCCG GCTGGGCGTA
501 TTTGAAAGTG CGCGAACTGT CTTTGGCGGG CGAACCCGGC TGGCGCGTCG 
551 TGTTTTACCT TTCCGTGACA GGTGTGGCGA TGTCGTCGGT TTGGGCGACG 
601 CTGACCGGCT GGCACACCCT GTCCTTTCCA TCGGCAGTTT ATCTGTCGTG 
651 CATCGGCGTG TCCGCGCTGA TTGCCCAACT GTCGATGACG CGCGCCTACA 
701 AAGTCGGCGA CAAATTCACG GTTGCCTCGC TTTCCTATAT GACCGTCGTT 
751 TTTTCCGCTC TGTCTGCCGC ATTTTTTCTG GGCGAAGAGC TTTTCTGGCA 
801 GGAAATACTC GGTATGTGCA TCATCATCCT CAGCGGTATT TTGAGCAGCA
851 TCCGCCCCAC TGCCTTCAAA CAGCGGCTGC AATCCCTGTT CCGCCAAAGA
901 TAA 

This corresponds to the amino acid sequence [<SEQ ID 542; ORF135-1>] (SEQ ID NO: 542;
ORF135-1):

1 MDTAKKDILG SGWMLVAAAC FTIMNVLIKE ASAKFALGSG ELVFWRMLFS
51 TVALGAAAVL RRDXFRTPHW KNHLNRSMVG TGAMLLLFYA VTHLPLATGV
101 TLSYTSSIFL AVFSFLILKE RISVYTQAVL LLGFAGVVLL LNPSFRSGQE
151 TAALAGLAGG AMSGWAYLKV RELSLAGEPG WRVVFYLSVT GVAMSSVWAT
201 LTGWHTLSFP SAVYLSCIGV SALIAQLSMT RAYKVGDKFT VASLSYMTVV
251 FSALSAAFFL GEELFWQEIL GMCIIILSGI LSSIRPTAFK QRLQSLFRQR
301 *
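
The numbered listings in this section can be checked for internal consistency: the complete nucleotide sequence ends in a row starting at position 901 that holds three bases, so it is 903 bases long, which is 301 codons, or 300 residues plus the stop. The Python sketch below shows one way to do that bookkeeping. It assumes each row is a leading position number followed by whitespace-separated blocks. The helper names `parse_listing` and `length_from_last_row` are illustrative only.

```python
def parse_listing(rows):
    """Join the sequence blocks of a numbered listing into one string.

    Assumption: each row looks like '<position> BLOCK BLOCK ...'; a leading
    all-digit field is treated as the position label and discarded.
    """
    blocks = []
    for row in rows:
        fields = row.split()
        if fields and fields[0].isdigit():
            fields = fields[1:]
        blocks.extend(fields)
    return "".join(blocks).upper()


def length_from_last_row(start_position: int, row_bases: str) -> int:
    """Total sequence length implied by the final row of a listing."""
    return start_position + len(row_bases) - 1


# Two rows in the listing format used above (first row truncated for brevity).
print(parse_listing(["1 ATGGATACCG CAAAAAAAGA", "901 TAA"]))

total = length_from_last_row(901, "TAA")
print(total, total % 3, total // 3 - 1)  # length, remainder, residues before the stop
```

A remainder of zero and a residue count equal to the last numbered residue of the protein listing indicate that the two listings agree in length.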

Computer analysis of this amino acid sequence gave the following results: 

Homology with a predicted ORF from N. meningitidis (strain A)

ORF135 (SEQ ID NO: 540) shows 99.0% identity over a 197aa overlap with an ORF (ORF135a)
(SEQ ID NO: 544) from strain A of N. meningitidis:

10 20 30
orf135.pep GTGAMLLLFYAVTILPLATGVTLSYTSSIF
HIM MM I II MINIM INN II
orf135a STVALGAAAVLRRDTFRTPHWKNHLNRSMVGTGAMLLLFYAVTHLPLATGVTLSYTSSIF
50 60 70 80 90 100

40 50 60 70 80 90
orf135.pep LAVFSFLILKERISVYTQAVLLLGFAGVVLLLNPSFRSGQETAALAGLAGGAMSGWAYLK
M | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | M | | M | | | | | | | | | | | | I I I I I I I I I
orf135a LAVFSFLILKERISVYTQAVLLLGFAGVVLLLNPSFRSGQETAALAGLAGGAMSGWAYLK
110 120 130 140 150 160

100 110 120 130 140 150
orf135.pep VRELSLAGEPGWRVVFYLSVTGVAMSSVWATLTGWHTLSFPSAVYLSCIGVSALIAQLSM
II II 1 1 II M 1 1 II 1 1 1 1 1 1 1 1 1 M I II 1 1 ! I I II 1 1 1 II M 1 1 1 1 1 M M II 1 1 1
orf135a VRELSLAGEPGWRVVFYLSVTGVAMSSVWATLTGWHTLSFPSAVYLSCIGVSALIAQLSM
170 180 190 200 210 220

160 170 180 190 200
orf135.pep TRAYKVGDKFTVASLSYMTVVFSALSAAFFLGEELFWQEILGMCIIISAVFX
' || | | | | | | | | | | | | | | | | | | | | | | | | | | | | | : | | | | | | | | | | | | | | |
orf135a TRAYKVGDKFTVASLSYMTVVFSALSAAFFLAEELFWQEILGMCIIILSGILSSIRPTAF
230 240 250 260 270 280

orf135a KQRLQSLFRQRX
290 300



The complete length ORF135a nucleotide sequence [<SEQ ID 543>] (SEQ ID NO: 543) is:



1 ATGGATACCG CAAAAAAAGA CATTTTAGGA TCGGGCTGGA TGCTGGTGGC 
51 GGCGGCCTGC TTTACCATTA TGAACGTATT GATTAAAGAG GCATCGGCAA
101 AATTTGCCCT CGGCAGCGGC GAATTGGTCT TTTGGCGCAT GCTGTTTTCA

151 ACCGTTGCGC TCGGGGCTGC CGCCGTATTG CGTCGGGACA CCTTCCGCAC 

201 GCCCCATTGG AAAAACCACT TAAACCGCAG TATGGTCGGG ACGGGGGCGA 

251 TGCTGCTGCT GTTTTACGCG GTAACGCATC TGCCTTTGGC CACCGGCGTT
301 ACCCTGAGTT ACACCTCGTC GATTTTTTTG GCGGTATTTT CCTTCCTGAT

351 TTTGAAAGAA CGGATTTCCG TTTACACGCA GGCGGTGCTG CTCCTTGGTT

401 TTGCCGGCGT GGTATTGCTG CTTAATCCCT CGTTCCGCAG CGGTCAGGAA
451 ACGGCGGCAC TCGCCGGGCT GGCGGGCGGC GCGATGTCCG GCTGGGCGTA
501 TTTGAAAGTG CGCGAACTGT CTTTGGCGGG CGAACCCGGC TGGCGCGTCG 

551 TGTTTTACCT TTCCGTGACA GGTGTGGCGA TGTCATCGGT TTGGGCGACG

601 CTGACCGGCT GGCACACCCT GTCCTTTCCA TCGGCAGTTT ATCTGTCGTG 

651 CATCGGCGTG TCCGCGCTGA TTGCCCAACT GTCGATGACG CGCGCCTACA 

701 AAGTCGGCGA CAAATTCACG GTTGCCTCGC TTTCCTATAT GACCGTCGTT 

751 TTTTCCGCTC TGTCTGCCGC ATTTTTTCTG GCCGAAGAGC TTTTCTGGCA 

801 GGAAATACTC GGTATGTGCA TCATCATCCT CAGCGGTATT TTGAGCAGCA

851 TCCGCCCCAC TGCCTTCAAA CAGCGGCTGC AATCCCTGTT CCGCCAAAGA 

901 TAA 

This encodes a protein having amino acid sequence [<SEQ ID 544>] (SEQ ID NO: 544):

1 MDTAKKDILG SGWMLVAAAC FTIMNVLIKE ASAKFALGSG ELVFWRMLFS
51 TVALGAAAVL RRDTFRTPHW KNHLNRSMVG TGAMLLLFYA VTHLPLATGV
101 TLSYTSSIFL AVFSFLILKE RISVYTQAVL LLGFAGVVLL LNPSFRSGQE
151 TAALAGLAGG AMSGWAYLKV RELSLAGEPG WRVVFYLSVT GVAMSSVWAT
201 LTGWHTLSFP SAVYLSCIGV SALIAQLSMT RAYKVGDKFT VASLSYMTVV
251 FSALSAAFFL AEELFWQEIL GMCIIILSGI LSSIRPTAFK QRLQSLFRQR
301 *

ORF135a (SEQ ID NO: 544) and ORF135-1 (SEQ ID NO: 542) show 99.3% identity in 300 aa
overlap:



orf135a.pep MDTAKKDILGSGWMLVAAACFTIMNVLIKEASAKFALGSGELVFWRMLFSTVALGAAAVL
I 1 1 II 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 Ml I M 1 1 1 1 1 1 1 1 1 1 1 1 1 1 M 1 1 1 M 1 1 1 : 1 1 1
orf135-1 MDTAKKDILGSGWMLVAAACFTIMNVLIKEASAKFALGSGELVFWRMLFSTVALGAAAVL

orf135a.pep RRDTFRTPHWKNHLNRSMVGTGAMLLLFYAVTHLPLATGVTLSYTSSIFLAVFSFLILKE
|: I I I I I I I I I I I I I I I I I I I II I I II I I I I I I I I I I I I I II I I I M I I Ml I I
orf135-1 RRDXFRTPHWKNHLNRSMVGTGAMLLLFYAVTHLPLATGVTLSYTSSIFLAVFSFLILKE

orf135a.pep RISVYTQAVLLLGFAGVVLLLNPSFRSGQETAALAGLAGGAMSGWAYLKVRELSLAGEPG
I I I I I I I I I I I I I I I I I I I I I I I I I I II , I I I I I I I I I I I I I I I I I I I I I i I I I I , I I I I
orf135-1 RISVYTQAVLLLGFAGVVLLLNPSFRSGQETAALAGLAGGAMSGWAYLKVRELSLAGEPG

orf135a.pep WRVVFYLSVTGVAMSSVWATLTGWHTLSFPSAVYLSCIGVSALIAQLSMTRAYKVGDKFT
I I 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 M
orf135-1 WRVVFYLSVTGVAMSSVWATLTGWHTLSFPSAVYLSCIGVSALIAQLSMTRAYKVGDKFT

orf135a.pep VASLSYMTVVFSALSAAFFLAEELFWQEILGMCIIILSGILSSIRPTAFKQRLQSLFRQR
I I I I I . I I I I I I I I I I I I I : I I I II I I II I I ' I I I I I I I I I I I I I I I I I I I I I < I I I I
orf135-1 VASLSYMTVVFSALSAAFFLGEELFWQEILGMCIIILSGILSSIRPTAFKQRLQSLFRQR



Homology with a predicted ORF from N. gonorrhoeae

ORF135 (SEQ ID NO: 540) shows 97% identity over a 201aa overlap with a predicted ORF
(ORF135ng) (SEQ ID NO: 546) from N. gonorrhoeae:

orf135.pep GTGAMLLLFYAVTXLPLATGVTLSYTSSIF 30
IIIMIIIIIIII I M = I I I I I I [ I ! I I I
orf135ng STVTLGAAAVLRRDTFRTPHWKNHLNRSMVGTGAMLLLFYAVTHLPLTTGVTLSYTSSIF 335

orf135.pep LAVFSFLILKERISVYTQAVLLLGFAGVVLLLNPSFRSGQETAALAGLAGGAMSGWAYLK 90
1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 i 1 1 1 1 1 1 1 1 1 M
orf135ng LAVFSFLILKERISVYTQAVLLLGFAGVVLLLNPSFRSGQEPAALAGLAGGAMSGWAYLK 395

orf135.pep VRELSLAGEPGWRVVFYLSVTGVAMSSVWATLTGWHTLSFPSAVYLSCIGVSALIAQLSM 150
Mill MM II II I II III HUM II MINI! Mill I II II I II III MUM I II
orf135ng VRELSLAGEPGWRVVFYLSATGVAMSSVWATLTGWHTLSFPSAVYLSGIGVSALIAQLSM 455

orf135.pep TRAYKVGDKFTVASLSYMTVVFSALSAAFFLGEELFWQEILGMCIIISAVF 201
I I I I I I I I I I M I I I I M II II I II I I II I I I M I I II I I I II I I II I M I
orf135ng TRAYKVGDKFTVASLSYMTVVFSALSAAFFLGEELFWQEILGMCIIISAAF 506

An ORF135ng nucleotide sequence [<SEQ ID 545>] (SEQ ID NO: 545) was predicted to encode a
protein having amino acid sequence [<SEQ ID 546>] (SEQ ID NO: 546):

1 MPSEKAFRRH LRTASFQGLH LHHFHQKVGK CGIIGFGIHI FPTLLPAAQG
51 ILDIQLGLFR IDFAALAVYR RTQVDFIHTV IDGIASDQAF SEVVQILRRL
101 NLGHFTDTHL IAQARRFIAD FGNIRPMRRG EAKTFCRCFR FDGIDGIHGD
151 FRQCGHINRL APGKDCRNGK RDKVFFHTRH YNQVCLEKTN CSARKIKFRH
201 QKQAKTHSTS LAARFTIRPS LSQRPFMDTA KKDILGSGWM LVAAACFTVM
251 NVLIKEASAK FALGSGELVF WRMLFSTVTL GAAAVLRRDT FRTPHWKNHL
301 NRSMVGTGAN LLLFYAVTHL PLTTGVTLSY TSSIFLAVFS FLILKERISV
351 YTQAVLLLGF AGVVLLLNPS FRSGQEPAAL AGLAGGAMSG WAYLKVRELS
401 LAGEPGWRVV FYLSATGVAM SSVWATLTGW HTLSFPSAVY LSGIGVSALI
451 AQLSMTRAYK VGDKFTVASL SYMTVVFSAL SAAFFLGEEL FWQEILGMCI
501 IISAAF*

Further work revealed the following gonococcal sequence [<SEQ ID 547>] (SEQ ID NO: 547):

1 ATGGATACCG CAAAAAAAGA CATTTTAGGA TCGGGCTGGA TGCTGGTGGC 

51 GGCGGCCTGC TTCACCGTTA TGAACGTATT GATTAAAGAG GCATCGGCAA 

101 AATTTGCCCT CGGCAGCGGC GAATTGGTCT TTTGGCGCAT GCTGTTTTCA 

151 ACCGTTACGC TCGGTGCTGC CGCCGTATTG CGGCGCGACA CCTTCCGCAC 

201 GCCCCATTGG AAAAACCACT TAAACCGCAG TATGGTCGGG ACGGGGGCGA 

251 TGCTGCTGCT GTTTTACGCG GTAACGCATC TGCCTTTGAC AACCGGCGTT

301 ACCCTGAGTT ACACCTCGTC GATTTTTttg GCGGTATTTT CCTTCCTGAT 

351 TTTGAAAGAA CGGATTTCCG TTTACACGCA GGCGGTGCTG CTCCTTGGTT 

401 TTGCCGGCGT GGTATTGCTG CTTAATCCCT CGTTCCGCAG CGGTCAGGAA

451 CCGGCGGCAC TCGCCGGGCT GGCGGGCGGC GCGATGTCCG GCTGGGCGTA

501 TTTGAAAGTG CGCGAACTGT CTTTGGCGGG CGAACCCGGC TGGCGCGTCG 

551 TGTTTTACCT TTCCGCAACC GGCGTGGCGA TGTCGTCggt ttgggcgacg 

601 Ctgaccggct ggCACAcccT GTCCTTTcca tcggcagttt ATCtgtCGGG 

651 CATCGGCGTG tccgcgCtgA TTGCCCAaCT GtcgatgAcg cGCGcctaca 

701 aaGTCGGCGA CAAATTCACG GTTGCCTCGC tttcctaTAt gaccgtcGTC 

751 TTTTCCGCCC TGTCTGCCGC ATTTTTTCTg ggcgaagagc tttTCtggCA 

801 GGAAATACTC GGTATGTGCA TCATTAtCCT CAGCGGCATT TTGAGCAGCA 

851 TCCGCCCCAT TGCCTTCAAA CAGCGGCTGC AAGCCCTCTT CCGCCAAAGA 

901 TAA

This corresponds to the amino acid sequence [<SEQ ID 548; ORF135ng-1>] (SEQ ID NO: 548;
ORF135ng-1):

1 MDTAKKDILG SGWMLVAAAC FTVMNVLIKE ASAKFALGSG ELVFWRMLFS
51 TVTLGAAAVL RRDTFRTPHW KNHLNRSMVG TGAMLLLFYA VTHLPLTTGV
101 TLSYTSSIFL AVFSFLILKE RISVYTQAVL LLGFAGVVLL LNPSFRSGQE
151 PAALAGLAGG AMSGWAYLKV RELSLAGEPG WRVVFYLSAT GVAMSSVWAT
201 LTGWHTLSFP SAVYLSGIGV SALIAQLSMT RAYKVGDKFT VASLSYMTVV
251 FSALSAAFFL GEELFWQEIL GMCIIILSGI LSSIRPIAFK QRLQALFRQR
301 *

ORF135ng-1 (SEQ ID NO: 548) and ORF135-1 (SEQ ID NO: 542) show 97.0% identity in 300 aa
overlap:

orf135ng-1.pep MDTAKKDILGSGWMLVAAACFTVMNVLIKEASAKFALGSGELVFWRMLFSTVTLGAAAVL
M 1 1 1 1 1 1 i 1 1 1 II 1 1 M 1 1 II :| 1 1 1 1 1 M I MM I i 1 1 1 1 1 1 1 1 1 1 1 1 1 M M 1 1 M
orf135-1 MDTAKKDILGSGWMLVAAACFTIMNVLIKEASAKFALGSGELVFWRMLFSTVALGAAAVL

orf135ng-1.pep RRDTFRTPHWKNHLNRSMVGTGAMLLLFYAVTHLPLTTGVTLSYTSSIFLAVFSFLILKE
MM I M II II M I II II I II 1 1 II II M II M MM M 1 1 II I II II I Ml I IM M
orf135-1 RRDXFRTPHWKNHLNRSMVGTGAMLLLFYAVTHLPLATGVTLSYTSSIFLAVFSFLILKE

orf135ng-1.pep RISVYTQAVLLLGFAGVVLLLNPSFRSGQEPAALAGLAGGAMSGWAYLKVRELSLAGEPG
Illlllllllllllllllllllllllllll I I M I I I M I I I I I I I I I I I I M I I
orf135-1 RISVYTQAVLLLGFAGVVLLLNPSFRSGQETAALAGLAGGAMSGWAYLKVRELSLAGEPG

orf135ng-1.pep WRVVFYLSATGVAMSSVWATLTGWHTLSFPSAVYLSGIGVSALIAQLSMTRAYKVGDKFT
1 1 1 1 ! II M 1 1 1 M 1 1 1 M 1 1 M 1 1 1 1 1 1 1 1 M M 1 1 1 II 1 1 1 II I 1 1 1 1 1 1 1 1 1 1 1 1
orf135-1 WRVVFYLSVTGVAMSSVWATLTGWHTLSFPSAVYLSCIGVSALIAQLSMTRAYKVGDKFT

orf135ng-1.pep VASLSYMTVVFSALSAAFFLGEELFWQEILGMCIIILSGILSSIRPIAFKQRLQALFRQR
1 1 1 1 1 1 M M 1 1 1 1 1 1 1 1 i 1 1 1 1 1 1 II II II 1 1 1 1 II I M 1 1 1 1 llllllhlllll
orf135-1 VASLSYMTVVFSALSAAFFLGEELFWQEILGMCIIILSGILSSIRPTAFKQRLQSLFRQR

Based on this analysis, including the presence of several putative transmembrane domains in the
gonococcal protein, it is predicted that the proteins from N. meningitidis and N. gonorrhoeae, and
their epitopes, could be useful antigens for vaccines or diagnostics, or for raising antibodies.
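
The text does not state how the putative transmembrane domains were identified. One common way to flag candidate membrane-spanning stretches is a sliding-window hydropathy average on the Kyte-Doolittle scale. The Python sketch below illustrates that approach only. The 19-residue window, the 1.6 threshold and the function names are assumptions made for illustration and are not taken from the original analysis.

```python
# Kyte-Doolittle hydropathy values for the 20 standard amino acids.
KD = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def hydropathy_profile(seq: str, window: int = 19):
    """Mean hydropathy of every full window along the sequence.

    Residues outside the standard 20 (for example X) score 0.0 here,
    which is an assumption of this sketch.
    """
    scores = [KD.get(residue, 0.0) for residue in seq.upper()]
    return [
        sum(scores[i:i + window]) / window
        for i in range(len(scores) - window + 1)
    ]


def candidate_tm_windows(seq: str, window: int = 19, threshold: float = 1.6):
    """1-based start positions of windows whose mean meets the threshold."""
    profile = hydropathy_profile(seq, window)
    return [i + 1 for i, value in enumerate(profile) if value >= threshold]


# Residues 101-150 of ORF135-1, taken from the listing in this example.
segment = "TLSYTSSIFLAVFSFLILKERISVYTQAVLLLGFAGVVLLLNPSFRSGQE"
print(candidate_tm_windows(segment))
```

Runs of consecutive start positions in the output mark stretches that are hydrophobic enough to be candidate membrane-spanning regions under these assumed settings.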



Example 66 

The following DNA sequence was identified in N. meningitidis [<SEQ ID 549>] (SEQ ID NO:
549):

1 ATGAAGCGGC GTATAGCCGT CTTCGTCCTG TTCCCGCAGA TAATCCGAGT 

51 TTTGGGACAA CTGTTGCCGA AAATCGTCAA TACAGTTCCG GCACATCGGA 

101 TGCTCTTCCA GATTTTCGGG ATGTTCTTTT TCTTCATACA CCAGCAATAT 

151 CTGCCCGGGA TCGCCGAAAT CGATTCCCCA TGCGGCATCG TGTTCGGTGC 

201 GCTCCTCTTC CGTCATCTGC CCGCGCATTG CCTGTATGGT AAAGCCGCCG
251 TAGGGGATGC CgTTGCACAC GAACATCCAG TCGCTGATGT CGTCAACCGG

301 AACGCAAACG cTTTCGCCTT GTTCGACATT GGTCAGTTCG CCsGGTTCAT 

351 TGTTCAGCAC ACCGTAAATA TAAAGACCGT CAAAATAAAT ATCGTCGATC 

401 CACATATGTT CGCAAATTTC GCCGTCTTCG CCGTCTTGGA AAAAAGGGAC 

451 TTTGACCATG GCAAAATCCA AGGCGGAAAT AATGCGGCGG CGTTCCCAAA

501 AAAGcTCGCG CCAAAAATAT TTGAATGTTT TACGGGCGCG TTCGTCGGCA 

551 CGGTTTACCG GTTCGTCTGC CTGTTCTACA TAATAAATGA CGGAATCGCC 

601 CATCAT^TCT GCTCCTCAAC GTGTACGGTA TCTGTTTGCA CCTTACT.GCG 

651 GCTTTCTgcC kTCGGCATCC GATTCGGATT TGAAAAGTTC mmrwyATTCG 

701 GAATAG

This corresponds to the amino acid sequence [<SEQ ID 550; ORF136>] (SEQ ID NO: 550;
ORF136):

1 MKRRIAVFVL FPQIIRVLGQ LLPKIVNTVP AHRMLFQIFG MFFFFIHQQY
51 LPGIAEIDSP CGIVFGALLF RHLPAHCLYG KAAVGDAVAH EHPVADVVNR
101 NANAFALFDI GQFAXFIVQH TVNIKTVKIN IVDPHMFANF AVFAVLEKRD
151 FDHGKIQGGN NAAAFPKKLA PKIFECFTGA FVGTVYRFVC LFYIINDGIA
201 HHSAPQRVRY LFAPYCGFLP SASDSDLKSS XXSE*

Further work revealed the complete nucleotide sequence [<SEQ ID 551>] (SEQ ID NO: 551):

1 ATGATGAAGC GGCGTATAGC CGTCTTCGTC CTGTTCCCGC AGATAATCCG 

51 AGTTTTGGGA CAACTGTTGC CGAAAATCGT CAATACAGTT CCGGCACATC 

101 GGATGCTCTT CCAGATTTTC GGGATGTTCT TTTTCTTCAT ACACCAGCAA 

151 TATCTGCCCG GGATCGCCGA AATCGATTCC CCATGCGGCA TCGTGTTCGG 

201 TGCGCTCCTC TTCCGTCATC TGCCCGCGCA TTGCCTGTAT GGTAAAGCCG

251 CCGTAGGGGA TGCCGTTGCA CACGAACATC CAGTCGCTGA TGTCGTCAAC 

301 CGGAACGCAA ACGCTTTCGC CTTGTTCGAC ATTGGTCAGT TCGCCGGGTT 

351 CATTGTTCAG CACACCGTAA ATATAAAGAC CGTCAAAATA AATATCGTCG 

401 ATCCACATAT GTTCGCAAAT TTCGCCGTCT TCGCCGTCTT GGAAAAAAGG

451 GACTTTGACC ATGGCAAAAT CCAAGGCGGA AATAATGCGG CGGCGTTCCC

501 AAAAAAGCTC GCGCCAAAAA TATTTGAATG TTTTACGGGC GCGTTCGTCG 

551 GCACGGTTTA CCGGTTCGTC TGCCTGTTCT ACATAATAAA TGACGGAATC 

601 GCCCATCATT CTGCTCCTCA ACGTGTACGG TATCTGTTTG CACCTTACTG 

651 CGGCTTTCTG CCTTCGGCAT CCGATTCGGA TTTGAAAAGT TCCAAATATT 

701 CGGAATAG

This corresponds to the amino acid sequence [<SEQ ID 552; ORF136-1>] (SEQ ID NO: 552;
ORF136-1):

1 MMKRRIAVFV LFPQIIRVLG QLLPKIVNTV PAHRMLFQIF GMFFFFIHQQ
51 YLPGIAEIDS PCGIVFGALL FRHLPAHCLY GKAAVGDAVA HEHPVADVVN
101 RNANAFALFD IGQFAGFIVQ HTVNIKTVKI NIVDPHMFAN FAVFAVLEKR
151 DFDHGKIQGG NNAAAFPKKL APKIFECFTG AFVGTVYRFV CLFYIINDGI
201 AHHSAPQRVR YLFAPYCGFL PSASDSDLKS SKYSE*

Computer analysis of this amino acid sequence gave the following results:



Homology with a predicted ORF from N. meningitidis (strain A)

ORF136 (SEQ ID NO: 550) shows 71.7% identity over a 237aa overlap with an ORF (ORF136a)
(SEQ ID NO: 554) from strain A of N. meningitidis:

10 20 30 40 50 59 

orf 136 .pep MKRRIAVFVLFPQIIRVLGQLLPKIVNTVPAHRMLFQIFGMFFFFIHQQYLPGIAEIDS 

5 MINI I I I I : I |:||lllll llllllllllll I I I ! I I I I I I I II I I I I I 

orf 136a MMKRRIAVFVLLMQKIRILGQLLPKIVNTVPAHRMLFQXFGMFFFFIHQQYLPGIAEIDS 

10 20 30 40 50 60 

60 70 80 90 100 110 119 

orf 136 . pep PCG I VFGALLFRHLPAHCLYGKAAVGDAVAHEHPVADWNRNANAFALFD I GQFAXFI VQ 

10 MINIMUM : 1 1 1 1 1 1 1 1 1 1 : 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 MM 

orf 13 6a PCG I VFGTLLFRHXSTHCLYGKAAVGNAVAHEHPVADWNRNANAFALFD IGQFAGFI VQ 

70 80 90 100 110 120 

120 130 140 150 160 170 179 

orf 13 6 .pep HTVNIKTVKINIVDPHMFANFAVFAVLEKRDFDHGKIQGGNNAAAFPKKLAPKIFECFTG 

15 h : h U I M I I I U I I I I I I MMMI : M = |: | :: s = 

orf 136a HAINVKTVKINIVDPHMFANFAXFAVLEKRALTMAKSKXXXMRRRSQKSSRQKYLNVLRA 

130 140 150 160 170 180 

180 190 200 210 220 230 

orf 136 .pep AFVGTVYRFVCLFYI INDGIAHH SAPQRVRYLFAPYCGFLPSASDSDLKSSXXSEX 

20 : MM : M M M M M M M M M M M M M M Ml 

orf 136a R SPARFTGLSACSTXXMTESPIISAPQRVRYLFAPYCGFLPSASDSDLKSSKYSEX 

190 200 210 220 230 

The complete length ORF136a nucleotide sequence [<SEQ ID 553>] (SEQ ID NO: 553) is:

1 ATGATGAAGC GGCGTATAGC CGTCTTCGTC CTGCTCATGC AGAAAATCCG

51 GATTTTGGGA CAACTGTTGC CGAAAATCGT CAATACAGTT CCGGCACATC 

101 GGATGCTCTT CCAGATNTTC GGGATGTTCT TTTTCTTCAT ACACCAGCAA 

151 TACCTGCCCG GGATCGCCGA AATCGATTCC CCATGCGGCA TCGTGTTCGG 

201 TACGCTCCTC TTCCGTCATC NGTCCACGCA TTGCCTGTAT GGTAAAGCCG
251 CCGTAGGGAA TGCCGTTGCA CACGAACATC CAGTCGCTGA TGTCGTCAAC

301 CGGAACGCAA ACGCTTTCGC CTTGTTCGAC ATTGGTCAGT TCGCCGGGTT
351 CATTGTTCAG CACGCCATAA ATGTAAAGAC CGTCAAAATA AATATCGTCG 

401 ATCCACATAT GTTCGCAAAT TTCGCCNTCT TCGCCGTCTT GGAAAAAAGG
451 GCTTTGACCA TGGCAAAATC TAAGGNGNNA NNGATGCGGC GGCGTTCCCA

501 AAAAAGCTCG CGCCAAAAAT ATTTGAATGT TTTGCGGGCG CGTTCGCCGG

551 CACGGTTTAC CGGTTTGTCT GCCTGTTCTA CATAATAAAT GACGGAATCG 
601 CCCATCATAT CTGCTCCTCA ACGTGTACGG TATCTGTTTG CACCTTACTG 
651 CGGCTTTCTG CCTTCGGCAT CCGATTCGGA TTTGAAAAGT TCCAAATATT 
701 CGGAATAG 

This encodes a protein having amino acid sequence [<SEQ ID 554>] (SEQ ID NO: 554):

1 MMKRRIAVFV LLMQKIRILG QLLPKIVNTV PAHRMLFQXF GMFFFFIHQQ
51 YLPGIAEIDS PCGIVFGTLL FRHXSTHCLY GKAAVGNAVA HEHPVADVVN
101 RNANAFALFD IGQFAGFIVQ HAINVKTVKI NIVDPHMFAN FAXFAVLEKR
151 ALTMAKSKXX XMRRRSQKSS RQKYLNVLRA RSPARFTGLS ACST**MTES
201 PIISAPQRVR YLFAPYCGFL PSASDSDLKS SKYSE*

ORF136a (SEQ ID NO: 554) and ORF136-1 (SEQ ID NO: 552) show 73.1% identity in 238 aa
overlap:

10 20 30 40 50 60

orf 13 6a. pep MMKRRIAVFVLLMQKIRILGQLLPKIVNTVPAHRMLFQXFGMFFFFIHQQYLPGIAEIDS 

MM Mllh I 1 : 1 1 1 1 1 1 1 1 1 1 II IIMIIIMIIMMIM 

orf 136-1 MMKRRIAVFVLFPQIIRVLGQLLPKIVNTVPAHRMLFQIFGMFFFFIHQQYLPGIAEIDS 
5 10 20 30 40 50 60 

70 80 90 100 110 120 

orf 136a . pep PCGIVFGTLLFRHXSTHCLYGKAAVGNAVAHEHPVADWNRNANAFALFDIGQFAGFIVQ 

MMMMIMI M Ml 1 1 1 I Ml 1 1 1 1 1 M 1 1 M I M II M M I M 1 1 M I M 1 1 

orf 136 - 1 PCGIVFGALLFRHLPAHCLYGKAAVGDAVAHEHPVADWNRNANAFALFDIGQFAGFIVQ 
10 70 80 90 100 110 120 

130 140 150 160 170 180 

orf 136a . pep HAINVKTVKINIVDPHMFANFAXFAVLEKRALTMAKSKXXXMRRRSQKSSRQKYLNVLRA 

I = : I : , M I i ! I I I MUM : M : |: | :: : : 

orf 136-1 HTVNIKTVKINIVDPHMFANFAVFAVLEKRDFDHGKIQGGNNAAAFPKKLAPKIFECFTG 
15 130 140 150 160 170 180 

190 200 210 220 230 

orf 13 6a. pep R SPARFTGLSACSTXXMTESPIISAPQRVRYLFAPYCGFLPSASDSDLKSSKYSEX 

: 11= I : ::: I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 

orf 13 6 - 1 AFVGTVYRFVCLFYI INDGIAHH SAPQRVRYLFAPYCGFLPSASDSDLKSSKYSEX 

20 190 200 210 220 230 

Homology with a predicted ORF from N. gonorrhoeae 

ORF136 (SEQ ID NO: 550) shows 92.3% identity over a 234aa overlap with a predicted ORF
(ORF136ng) (SEQ ID NO: 556) from N. gonorrhoeae:

orf136.pep MKRRIAVFVLFPQIIRVLGQLLPKIVNTVPAHRMLFQIFGMFFFFIHQQYLPGIAEIDS 59
1 1 1 1 1 M 1 1 1 : I I h 1 1 1 1 1 1 1 1 1 1 1 II II 1 1 1 1 1 1 1 1 1 1 1 1 1 1 h 1 1 1 1 1 1 1 1 1 1 1
orf136ng MMKRRIAVFVLLMQKIRILGQLLPKIVNTVPAHRMLFQIFGMFFFFIHRQYLPGIAEIDS 60

orf136.pep PCGIVFGALLFRHLPAHCLYGKAAVGDAVAHEHPVADVVNRNANAFALFDIGQFAXFIVQ 119
I MIMIIMI I MIIIMI IMMIIMMMIMMMMIMM I 1 1 1 1
orf136ng PGGIVFGTLLFRHLSAHCLYGKAAVGDAVAHEHPVADVANRNANAFALFDIGQSAGFIVQ 120

orf136.pep HTVNIKTVKINIVDPHMFANFAVFAVLEKRDFDHGKIQGGNNAAAFPKKLAPKIFECFTG 179
I I M II 1 1 II 1 1 1 1 1 1 M M II 1 1 1 1 1 II 1 1 1 M M 1 1 1 1 1 1 1 M M i I II Ml II 1 1
orf136ng HTVNIKTVKINIVDPHMFANFAVFAVLEKRDFDHGKIQGGNNAAAFPKKLAPKVFECFTG 180

orf136.pep AFVGTVYRFVCLFYIINDGIAHHSAPQRVRYLFAPYCGFLPSASDSDLKSSXXSE 234
I I : I I i I I I I I I I I I I I I : I I I I I I I I I I I I I I II
orf136ng AFAGTVYRFVCLFYIINDGIAHHTAPQRVRYLFAPYRGFLPPASDSDLKSSKYSE 235

The complete length ORF136ng nucleotide sequence [<SEQ ID 555>] (SEQ ID NO: 555) is:

1 ATGATGAAGC GGCGTATAGC CGTCTTCGTC CTGCTCATGC AGAAAATCCG 

51 GATTTTGGGA CAACTGTTGC CGAAAATCGT CAATACAGTT CCGGCACATC 

101 GGATGCTCTT CCAAATTTTC GGGATGTTCT TTTTCTTCAT ACACCGGCAA

151 TACCTGCCCG GGATCGCCGA AATCGATTCC CCAGGCGGTA TCGTGTTCGG 

201 TACGCTCCTC TTCCGTCATC TGTCCGCGCA TTGCCTGTAC GGTAAAGCCG 

251 CCGTAGGGGA TGCCGTTGCA CACGAACATC CAGTCGCTGA TGTCGCCAAC 

301 CGGAACGCAA ACGCTTTCGC CTTGTTCGAC ATTGGTCAGT CCGCCGGGTT 

351 CATTGTTCAG CACACCGTAA ATATAAAGAC CGTCAAAATA AATATCGTCG

401 ATCCACATAT GTTCGCAAAT TTCGCCGTCT TCGCCGTCTT GGAAAAAAGG

451 GACTTTGACC ATGGCAAAAT CCAAGGCGGA AATAATGCGG CGGCGTTCCC

501 AAAAAAGCTC GCGCCAAAAG TATTTGAATG TTTTACGGGC GCGTTCGCCG 

551 GCACGGTTTA CCGGTTCGTC TGCCTGTTCT ACATAATAAA TGACGGAATC 

601 GCCCATCATA CTGCTCCTCA ACGTGTACGG TATCTGTTTG CACCTTACCG

651 CGGTTTTCTA CCTCCGGCAT CCGATTCGGA TTTGAAAAGT TCCAAATATT 

701 CGGAATAG 

This encodes a protein having amino acid sequence [<SEQ ID 556>] (SEQ ID NO: 556):

1 MMKRRIAVFV LLMQKIRILG QLLPKIVNTV PAHRMLFQIF GMFFFFIHRQ

51 YLPGIAEIDS PGGIVFGTLL FRHLSAHCLY GKAAVGDAVA HEHPVADVAN

101 RNANAFALFD IGQSAGFIVQ HTVNIKTVKI NIVDPHMFAN FAVFAVLEKR

151 DFDHGKIQGG NNAAAFPKKL APKVFECFTG AFAGTVYRFV CLFYIINDGI

201 AHHTAPQRVR YLFAPYRGFL PPASDSDLKS SKYSE*
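The relationship between a nucleotide listing and the protein it encodes can be checked by translating the open reading frame codon by codon. The sketch below is illustrative and is not the software used for this specification. It assumes the standard genetic code, reads in frame from the first base, and renders any codon containing an ambiguous base as X, in line with the X residues shown in the listings; the names `CODON_TABLE` and `translate` are hypothetical.

```python
BASES = "TCAG"
# Standard genetic code, ordered T, C, A, G at each codon position.
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    b1 + b2 + b3: AMINO[16 * i + 4 * j + k]
    for i, b1 in enumerate(BASES)
    for j, b2 in enumerate(BASES)
    for k, b3 in enumerate(BASES)
}


def translate(dna: str) -> str:
    """Translate a DNA string in frame 1; '*' marks a stop codon.

    Digits, spaces and punctuation from the printed listing are ignored.
    Codons with ambiguous bases (for example N) become 'X'.
    A trailing partial codon is dropped.
    """
    seq = "".join(ch for ch in dna.upper() if ch.isalpha())
    protein = []
    for pos in range(0, len(seq) - 2, 3):
        protein.append(CODON_TABLE.get(seq[pos:pos + 3], "X"))
    return "".join(protein)
```

Applied to the first 50 nucleotides of SEQ ID NO: 555, this yields MMKRRIAVFVLLMQKI, which agrees with the start of SEQ ID NO: 556.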


ORF136ng (SEQ ID NO: 556) and ORF136-1 (SEQ ID NO: 552) show 93.6% identity in 235 aa
overlap:

orf 13 6ng MM KRRIAVFVLLMQKIRILGQLLPKIVNTVPAHRMLFQ I FGMFFFF I HRQ YLPGIAEIDS 

IIIIIMIIIh I I f ^ M 1 1 1 1 i I I I I 1 1 I I 1 M I I 1 I I : .IM I 

20 orf 136-1. MMKRRIAVFVLFPQIIRVLGQLLPKIVNTVPAHRMLFQIFGMFFFFIHQQYLPGIAEIDS 

orf 13 6ng PGG I VFGTLLFRHLSAHCLYGKAAVGDAVAHEHPVADVANRNANAFALFD IGQSAGFIVQ 

I llllhllllll 1 1 1 1 1 1 1 1 1 1 1 U 1 1 1 1 1 ihl 1 1 1 M 1 1 1 1 1 1 II llllll 

orf 13 6 - 1 PCGIVFGALLFRHLPAHCLYGKAAVGDAVAHEHPVADWNRNANAFALFDIGQFAGFIVQ 

orf 13 6ng HTVNIKTVKINIVDPHMFANFAVFAVLEKRDFDHGKIQGGNNAAAFPKKLAPKVFECFTG 
25 I | | I I I M I I I I I I I I I I I I I I I I I I M I I I I hi I I I I I I I I I I M I I I I II : I I I I 

orf 136-1 HTWIKTVKINIVDPHMFANFAVFAVLEKIIDFDHGKIQGGNNAAAFPKKLAPKIFECFTG 

orf 136ng AFAGTVYRFVCLFYIINDGIAHHTAPQRVRYLFAPYRGFLPPASDSDLKSSKYSEX 

I I : M I I ' I I I I I I : I I I I t I I I I , I I I I I I I I I I I I I I I I I I I 

orf 136 - 1 AFVGTVYRFVCLFYIINDGIAHHSAPQRVRYLFAPYCGFLPSASDSDLKSSKYSEX 






Based on the presence of the putative transmembrane domains in the gonococcal protein, it is 
predicted that the proteins from N. meningitidis and N. gonorrhoeae, and their epitopes, could be 
useful antigens for vaccines or diagnostics, or for raising antibodies. 

Example 67 



The following partial DNA sequence was identified in N. meningitidis [<SEQ ID 557>] (SEQ ID
NO: 557):

1 ATGGAAAATA TGGTAACGTT TTCAAAAATC AGACCGCTTT TGGCAATCGC 

51 CGCCGCCGCG TTGCTTGCCG CC.TGCGGAC GGCGGGAAAT AATGCTGTCC 

101 GCAAGCCGGT GCAAACCGCC AAACCCGCCG CAGTGGTCGG TTTGGCACTC 

151 GGTGGCGGCG CATCTAAAGG ATTTGCCCAT GTAGGTATTA TTAAGGTTTT

201 GAAAGAAAAC GGTATTCCTG TGAAGGTGGT TACCGGCACC TCCGCAGGTT 

251 CGATTGTCGG CAACCTTTTT GCATCGGGTA TGTCGCCCGA CCGCCTCGAA 
301 TTGGAAGCCG AAATTTTAGG CAAAACCGAT TTGGTCGATT TAACCTTGTC

351 CACCAATGGG TTTATCAAAG GCGCAAAGCT GCAAAATTAC ATCAACCGAA 

401 AACTCCGCGG CATGCAGATT CAGCAGTTTC CCATCAAATT TGCCGCC . .

This corresponds to the amino acid sequence [<SEQ ID 558; ORF137>] (SEQ ID NO: 558;
ORF137):

1 MENMVTFSKI RPLLAIAAAA LLAAXRTAGN NAVRKPVQTA KPAAVVGLAL

51 GGGASKGFAH VGIIKVLKEN GIPVKVVTGT SAGSIVGNLF ASGMSPDRLE

101 LEAEILGKTD LVDLTLSTNG FIKGAKLQNY INRKLRGMQI QQFPIKFAA. .

Further work revealed the complete nucleotide sequence [<SEQ ID 559>] (SEQ ID NO: 559):

1 ATGGAAAATA TGGTAACGTT TTCAAAAATC AGACCGCTTT TGGCAATCGC 

51 CGCCGCCGCG TTGCTTGCCG CCTGCGGCAC GGCGGGAAAT AATGCTGTCC 

101 GCAAGCCGGT GCAAACCGCC AAACCCGCCG CAGTGGTCGG TTTGGCACTC 

151 GGTGGCGGCG CATCTAAAGG ATTTGCCCAT GTAGGTATTA TTAAGGTTTT 

201 GAAAGAAAAC GGTATTCCTG TGAAGGTGGT TACCGGCACA TCGGCAGGTT 

251 CGATTGTCGG CAGCCTTTTT GCATCGGGTA TGTCGCCCGA CCGCCTCGAA 

301 TTGGAAGCCG AAATTTTAGG CAAAACCGAT TTGGTCGATT TAACCTTGTC 

351 CACCAGTGGT TTTATCAAAG GCGAAAAGCT GCAAAATTAC ATCAACCGAA 

401 AAGTCGGCGG CAGGCAGATT CAGCAGTTTC CCATCAAATT TGCCGCCGTT

451 GCTACTGATT TTGAAACCGG CAAGGCCGTC GCTTTCAATC AGGGGAATGC

501 CGGGCAGGCT GTGCGCGCTT CCGCCGCCAT TCCCAATGTG TTCCAACCCG 

551 TTATCATCGG CAGGCATACA TATGTTGACG GCGGTCTGTC GCAGCCCGTG 

601 CCCGTCAGTG CCGCCCGGCG GCAGGGGGCG AATTTCGTGA TTGCCGTCGA 

651 TATTTCCGCC CGTCCGGGCA AAAACATCAG CCAAGGTTTC TTCTCTTATC 

701 TCGATCAGAC GCTGAACGTA ATGAGCGTTT CTGCGTTGCA AAATGAGTTG 

751 GGGCAGGCGG ATGTGGTTAT CAAACCGCAG GTTTTGGATT TGGGTGCAGT 

801 CGGCGGATTC GATCAGAAAA AACGCGCCAT CCGGTTGGGT GAGGAGGCAG 

851 CACGTGCCGC ATTGCCTGAA ATCAAACGCA AACTGGCGGC ATACCGTTAT 

901 TGA 
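A printed listing of this kind can be turned back into a contiguous sequence and given a quick sanity check before further analysis. The sketch below is one illustrative way of doing so and makes several assumptions: position numbers, spaces and blank lines carry no sequence information, a complete ORF starts with ATG (bacterial genes may also use alternative start codons), and it ends with one of the three standard stop codons. The names `join_listing` and `looks_like_complete_orf` are hypothetical.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def join_listing(text: str) -> str:
    """Drop position numbers, whitespace and punctuation; keep the letters."""
    return "".join(ch for ch in text.upper() if ch.isalpha())


def looks_like_complete_orf(seq: str) -> bool:
    """True if seq is a whole number of codons, starts ATG and ends in a stop.

    Assumption: ATG is the only start codon considered.
    """
    return (
        len(seq) >= 6
        and len(seq) % 3 == 0
        and seq[:3] == "ATG"
        and seq[-3:] in STOP_CODONS
    )
```

The listing above ends with TGA at position 901, giving 903 nucleotides, that is 300 codons plus a stop codon, which is consistent with a 300-residue protein.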



This corresponds to the amino acid sequence [<SEQ ID 560; ORF137-1>] (SEQ ID NO: 560;
ORF137-1):



1 MENMVTFSKI RPLLAIAAAA LLAACGTAGN NAVRKPVQTA KPAAVVGLAL

51 GGGASKGFAH VGIIKVLKEN GIPVKVVTGT SAGSIVGSLF ASGMSPDRLE

101 LEAEILGKTD LVDLTLSTSG FIKGEKLQNY INRKVGGRQI QQFPIKFAAV

151 ATDFETGKAV AFNQGNAGQA VRASAAIPNV FQPVIIGRHT YVDGGLSQPV

201 PVSAARRQGA NFVIAVDISA RPGKNISQGF FSYLDQTLNV MSVSALQNEL

251 GQADVVIKPQ VLDLGAVGGF DQKKRAIRLG EEAARAALPE IKRKLAAYRY

301 *



Computer analysis of this amino acid sequence gave the following results: 



Homology with a predicted ORF from N. meningitidis (strain A)

ORF137 (SEQ ID NO: 558) shows 93.3% identity over a 149aa overlap with an ORF (ORF137a)
(SEQ ID NO: 562) from strain A of N. meningitidis:

10 20 30 40 50 60

orf 137 .pep MENMVTFSKIRPLLAIAAAALLAAXRTAGNNAVRKPVQTAKPAAVVGLALGGGASKGFAH 

MINIUM MM I II II II II I 1 1 M M 1 1 1 II I ; M 1 1 1 1 1 II 1 1 1 1 1 1 1 1 

orf 137a MENMVTFSKIRPLLAIAAAALLAACGTAGNNAARKPVQTAKPAAVVGIiALGGGASKGFAH 
5 10 20 30 40 50 60 

70 80 90 100 110 120 

orf 137 .pep VGI IKVLKENGI PVKWTGTSAGS IVGNLFASGMSPDRLELEAEILGKTDLVDLTLSTNG 

1 1 1 M 1 1 M II 1 1 1 1 II 1 1 1 1 II M 1 1 • II M 1 1 II M 1 1 1 M 1 1 II I M I M 1 1 M h I 

orf 137a VGI I KVLKENG I PVKWTGTSAGS I VGSL FAS GMS PDRLELEAEI LGKTDLVDLTLSTSG 

10 70 80 90 100 110 120 

130 140 149. 

orf 137 .pep F I KGAKLQNY INRKLRGMQ IQQFP I KFAA 

MM IMIIIIIh I MIIIIIIMI 

orf 137a FI KGEKLQNYINRKVGGRRIQQFPI KFAAVATDFETGKAVAFNQGNAGQAVRASAAI PNV 

15 130 140 150 160 170 180 

The complete length ORF137a nucleotide sequence [<SEQ ID 561>] (SEQ ID NO: 561) is:

1 ATGGAAAATA TGGTAACGTT TTCAAAAATC AGACGGCTTT TGGCAATCGC 

51 CGCCGCCGCG TTGCTTGCCG CCTGCGGCAC GGCGGGAAAT AATGCTGCCC 

101 GCAAGCCGGT GCAAACCGCC AAACCCGCCG CAGTGGTCGG TTTGGCACTC

151 GGTGGCGGCG CATCTAAAGG ATTTGCCCAT GTAGGTATTA TTAAGGTTTT 

201 GAAAGAAAAC GGTATTCCTG TGAAGGTGGT TACCGGCACA TCGGCAGGTT

251 CGATAGTCGG CAGCCTTTTT GCATCGGGTA TGTCGCCCGA CCGCCTCGAA

301 TTGGAAGCCG AAATTTTAGG TAAAACCGAT TTGGTCGATT TAACCTTGTC

351 CACCAGTGGT TTTATCAAAG GCGAAAAGCT GCAAAATTAC ATCAACCGAA

401 AAGTCGGCGG CAGGCGGATT CAGCAGTTTC CCATCAAATT TGCCGCCGTT

451 GCTACTGATT TTGAAACCGG CAAGGCCGTC GCTTTCAATC AAGGGAATGC
501 CGGGCAGGCT GTGCGCGCTT CCGCCGCCAT TCCCAATGTG TTCCAACCCG 
551 TTATCATCGG CAGGCATACA TATGTTGACG GCGGTCTGTC GCAGCCCGTG 

601 CCCGTCAGTG CCGCCCGGCG GCANGNNNNG NATNTCGTGA TTGCCGTCGA

651 TATTTCCGCC CGTCCGAGCA AAAACATCAG CCAAGGCTTC TTCTCTTATC 

701 TCGATCAGAC GCTGAACGTA ATGAGCGTTT CCGCGTTGCA AAATGAGTTG 

751 GGGCAGGCGG ATGTGGTTAT CAAACCGCAG GTTTTGGATT TGGGTGCAGT 

801 CGGCGGATTC GATCAGAAAA AACGCGCCAT CCGGTTGGGT GAGGAGGCAG 

851 CACGTGCCGC ATTGCCTGAA ATCAAACGCA AACTGGCGGC ATACCGTTAT

901 TGA 

This encodes a protein having amino acid sequence [<SEQ ID 562>] (SEQ ID NO: 562):



1 MENMVTFSKI RPLLAIAAAA LLAACGTAGN NAARKPVQTA KPAAVVGLAL

51 GGGASKGFAH VGIIKVLKEN GIPVKVVTGT SAGSIVGSLF ASGMSPDRLE

101 LEAEILGKTD LVDLTLSTSG FIKGEKLQNY INRKVGGRRI QQFPIKFAAV

151 ATDFETGKAV AFNQGNAGQA VRASAAIPNV FQPVIIGRHT YVDGGLSQPV

201 PVSAARRXXX XXVIAVDISA RPSKNISQGF FSYLDQTLNV MSVSALQNEL

251 GQADVVIKPQ VLDLGAVGGF DQKKRAIRLG EEAARAALPE IKRKLAAYRY

301 *

ORF137a (SEQ ID NO: 562) and ORF137-1 (SEQ ID NO: 560) show 97.3% identity in 300 aa
overlap:



orf137a.pep  MENMVTFSKIRPLLAIAAAALLAACGTAGNNAARKPVQTAKPAAVVGLALGGGASKGFAH
             ||||||||||||||||||||||||||||||||:|||||||||||||||||||||||||||
orf137-1     MENMVTFSKIRPLLAIAAAALLAACGTAGNNAVRKPVQTAKPAAVVGLALGGGASKGFAH

orf137a.pep  VGIIKVLKENGIPVKVVTGTSAGSIVGSLFASGMSPDRLELEAEILGKTDLVDLTLSTSG
             ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
orf137-1     VGIIKVLKENGIPVKVVTGTSAGSIVGSLFASGMSPDRLELEAEILGKTDLVDLTLSTSG

orf 137a pep FIKGEKLQNYINRKVGGRRIQQFPIKFAAVATDFETGKAVAFNQGNAGQAVRASAAIPNV 

5 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 M M II 1 1 1 1 1 II 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 II II I 

orf 137-1 F I KGE KLQN Y I NRKVGGRQ I QQ F P I KFAAVATD FETGKAVAFNQGNAGQAVRAS AA I PNV 

orf 137a. pep FQPVIIGRHTYVDGGLSQPVPVSAARRXXXXXVIAVDISARPSKNISQGFFSYLDQTLNV 

II II MM I II II II II MM I II Ml I M 1 1 1 1 1 M 1 1 1 1 1 1 1 1 1 1 1 1 1 1 

orf 137-1 FQPVIIGRHTYVDGGLSQPVPVSAARRQGANFVIAVDISARPGKNISQGFFSYLDQTLNV 
10 orf 137a . pep MSVSALQNELGQADWIKPQVLDLGAVGGFDQKKRAIRLGEEAARAALPEIKRKLAAYRY 

II II M II 1 1 1 II M 1 1 II 1 1 1 II 1 1 1 1 1 1 II 1 1 1 II M II 1 1 1 1 1 1 M I II II II II M 

or f 1 3 7 - 1 MS VSALQNELGQADW I KPQVLDLGAVGGFDQKKRA I RLGEEAARAAL PE I KRKLAAYRY 

Homology with a predicted ORF from N. gonorrhoeae 

ORF137 (SEQ ID NO: 558) shows 89.9% identity over a 149aa overlap with a predicted ORF
(ORF137ng) (SEQ ID NO: 564) from N. gonorrhoeae:

orf 137 .pep MENMVTFSKIRPLLAIAAAALLAAXRTAGNNAVRKPVQTAKPAAWGLALGGGASKGFAH 60 

IIIIIMMII MIMMMIM II 1 1 ! M 1 1 1 1 II 1 1 M 1 1 - 1 1 i 1 1 1 1 1 1 II 

orf 1 3 7ng MENMVTFSKIRS FLA I AAAALLAACGTAGNNAARKPVQTAKPAAWALALGGGAS KGFAH 6 0 

or f 13 7 . pep VG 1 1 KVLKENG I PVKWTGTSAGS I VGNL FAS GMS PDRLELEAE I LGKTDLVDLTLS TNG 12 0 

20 ■ : | | : | | || || | || | || | || | | || || I I : h II I I II I II II I II I I I I I I II II I I I M I 

orf 137ng IGIVKVLKENGI PVKWTGTSAGS I VGS LLAS GMS PDRLELEAE I LGKTDLVDLTLSTSG 12 0 

orf 13 7 . pep F I KGAKLQN Y I NRKLRGMQ I QQ FP I KFAA 14 9 

MM 1)1111111= I MIMIIMM 
orfl37ng FI KGEKLQNY I NRKVGGRQ IQQFP I KFAAVATD FETGKAVAFNQGNAGQAVRAS AAI PNV 180 

The complete length ORF137ng nucleotide sequence [<SEQ ID 563>] (SEQ ID NO: 563) is:



1 ATGGAAAATA TGGTAACGTT TTCAAAAATC AGATCATTTT TGGCAATCGC 

51 CGCCGCCGCG TTGCTTGCCG CCTGCGGTAC GGCGGGAAAC AATGCCGCCC 

101 GCAAGCCGGT GCAAACCGCC AAACCCGCCG CAGTGGTCGC TTTGGCACTC 

151 GGTGGCGGCG CATCTAAAGG ATTTGCCCAT ATAGGAATTG TTAAGGTTTT

201 GAAAGAAAAC GGTATTCCTG TGAAGGTGGT TACCGGCACA TCGGCAGGTT 

251 CGATAGTCGG CAGCCTTTTG GCATCGGGTA TGTCGCCCGA CCGCCTCGAA 

301 TTGGAAGCCG AGATTTTAGG TAAAACCGAT TTAGTCGATT TAACCTTGTC
351 CACCAGTGGT TTTATCAAAG GCGAAAAGCT GCAAAATTAC ATCAACCGAA 

401 AAGTCGGCGG CAGGCAGATT CAGCAGTTTC CCATCAAATT TGCCGCCGTT

451 GCCACTGATT TTGAAACCGG CAAGGCCGTC GCTTTCAATC AAGGGAATGC
501 CGGGCAGGCG GTTCGTGCTT CCGCCGCCAT TCCCAATGTG TTCCAGCCAG 
551 TCATCATCGG CAGGCACAAA TATGTTGACG GCGGTCTGTC GCAGCCCGTG 
601 CCCGTCAGTG CCGCTCGGCG GCAGGGGGCG AATTTCGTGA TTGCCGTCGA 

651 TATTTCCGCA CGTCCGAGCA AAAATGTCGG TCAAGGTTTC TTCTCTTATC

701 TCGATCAGAC GCTGAACGTG ATGAGCGTTT CCGTGTTGCA AAACGAGTTG 

751 gggcAGGCGG ATGTGGTTAT CAAACCGCag gtTTTGGATT TGGGTGCAGT 

801 CGGCGGATTC GATCAGAAAA AGCGCGCCAT CCGGTTGGGC GAGGAGGCAG 

851 CACGTGCCGC ATTGCCTGAA ATCAAACGCA AACTGGCGGC ATACCGTTAT 

901 TGA






This encodes a protein having amino acid sequence [<SEQ ID 564>] (SEQ ID NO: 564):

1 MENMVTFSKI RSFLAIAAAA LLAACGTAGN NAARKPVQTA KPAAVVALAL

51 GGGASKGFAH IGIVKVLKEN GIPVKVVTGT SAGSIVGSLL ASGMSPDRLE

101 LEAEILGKTD LVDLTLSTSG FIKGEKLQNY INRKVGGRQI QQFPIKFAAV

151 ATDFETGKAV AFNQGNAGQA VRASAAIPNV FQPVIIGRHK YVDGGLSQPV

201 PVSAARRQGA NFVIAVDISA RPSKNVGQGF FSYLDQTLNV MSVSVLQNEL

251 GQADVVIKPQ VLDLGAVGGF DQKKRAIRLG EEAARAALPE IKRKLAAYRY

301 *

ORF137ng (SEQ ID NO: 564) and ORF137-1 (SEQ ID NO: 560) show 96.0% identity in 300 aa
overlap:

orf 13 7ng MENMVTFS KI RS FLAI AAAALLAACGTAGNNAARKPVQTAKPAAVVALALGGGAS KGFAH 

lllllllllll : I I I I I I I I I I I I I I M I I I : I I I I I I I I I I M h I I II I I I I I I I I I 
or f 13 7 - 1 MENMVTFSKIRPLLAI AAAALLAACGTAGNNAVRKPVQTAKPAAWGLALGGG AS KGFAH 

15 orf 13 7ng IGIVKVLKENGIPVKWTGTSAGSIVGSLLASGMSPDRLELEAEILGKTDLVDLTLSTSG 

:||:|IIMIII IIIIMIIIIMIIIhlllillll IIIIMIIMII IMIIM 
orf 137-1 VGIIKVLKENGIPVKWTGTSAGSIVGSLFASGMSPDRLELEAEILGKTDLVDLTLSTSG 

orf 137ng FI KGEKLQNY INRKVGGRQI QQFP I KFAAVATDFETGKAVAFNQGNAGQAVRASAAIPNV 

I I I I I I ! I I I I I I I I I I M I I I I I II I i I I I I I I I M I I I I I I I I I I M I I I I I I I 
20 orf 13 7-1 F I KGEKLQNY INRKVGGRQI QQFP I KFAAVATDFETGKAVAFNQGNAGQAVRASAAI PNV 

orf 13 7ng FQPVIIGRHKYVDGGLSQPVPVSAARRQGANFVIAVDISARPSKNVGQGFFSYLDQTLNV 

Illllllll I I I I I M I I I M I I I I I I I M II I II I I I M h : I I I I I M II I I I 
orf 137-1 FQPVIIGRHTYVDGGLSQPVPVSAARRQGANFVIAVDISARPGKNISQGFFSYLDQTLNV 

orf 13 7ng MS VS VLQNELGQADWI KPQVLDLGAVGGFDQKKRA I RLGEEAARAALPE I KRKLAAYRY 

25 | | | | : | | | | | | | | | | | | | | | | | | | | || | | | | | | | || | | | | | | | | | | I II I I I I I II I I II 

orf 137 MSVSALQNELGQADWI KPQVLDLGAVGGFDQKKRA I RLGEEAARAALPE I KRKLAAYRY 

Based on the presence of a predicted prokaryotic membrane lipoprotein lipid attachment site
(underlined) in the gonococcal protein, it is predicted that the proteins from N. meningitidis and
N. gonorrhoeae, and their epitopes, could be useful antigens for vaccines or diagnostics, or for
raising antibodies.



Example 68 



The following partial DNA sequence was identified in N. meningitidis [<SEQ ID 565>] (SEQ ID
NO: 565):



1 ATGTTTCGTT TACAATTCAG GCTGTTTCCC CCTTTGCGAA CCGCCATGCA

51 CATCCTGTTG ACCGCCCTGC TCAAATGCCT CTCCCTGcTG CCGCTTTCCT 

101 GTCTGCACAC GCTGGGAAAC CGGCTCGGAC ATCTGGCGTT TTACCTTTTA 

151 AAGGAAGACC GCGCGCGCAT CGTCGCCmAT ATGCGGCAGG CGGGTTTGAA 

201 CCCCGACCCC AAAACGGTCA AAGCCGTTTT TGCGGAAACG GCAAAAGGCG 

251 GTTTGGAACT TGCCCCCGCG TTTTTCAGAA AACCGGAAGA CATAGAAACA

301 ATGTTCAAAG CGGTACACGG CTGGGAACAT GTGCAGCAGG CTTTGGACAA

351 ACACGAAGGG CTGCTATTC . .

This corresponds to the amino acid sequence [<SEQ ID 566; ORF138>] (SEQ ID NO: 566;
ORF138):

1 MFRLQFRLFP PLRTAMHILL TALLKCLSLL PLSCLHTLGN RLGHLAFYLL 
51 KEDRARIVAX MRQAGLNPDP KTVKAVFAET AKGGLELAPA FFRKPEDIET 
101 MFKAVHGWEH VQQALDKHEG LLF 

Further work revealed the complete nucleotide sequence [<SEQ ID 567>] (SEQ ID NO: 567):

1 ATGTTTCGTT TACAATTCAG GCTGTTTCCC CCTTTGCGAA CCGCCATGCA 

51 CATCCTGTTG ACCGCCCTGC TCAAATGCCT CTCCCTGCTG CCGCTTTCCT 

101 GTCTGCACAC GCTGGGAAAC CGGCTCGGAC ATCTGGCGTT TTACCTTTTA 

151 AAGGAAGACC GCGCGCGCAT CGTCGCCAAT ATGCGGCAGG CGGGTTTGAA 

201 CCCCGACCCC AAAACGGTCA AAGCCGTTTT TGCGGAAACG GCAAAAGGCG 

251 GTTTGGAACT TGCCCCCGCG TTTTTCAGAA AACCGGAAGA CATAGAAACA

301 ATGTTCAAAG CGGTACACGG CTGGGAACAT GTGCAGCAGG CTTTGGACAA

351 ACACGAAGGG CTGCTATTCA TCACGCCGCA CATCGGCAGC TACGATTTGG

401 GCGGACGCTA CATCAGCCAG CAGCTTCCGT TCCCGCTGAC CGCCATGTAC
451 AAACCGCCGA AAATCAAAGC GATAGACAAA ATCATGCAGG CGGGCAGGGT
501 TCGCGGCAAA GGAAAAACCG CGCCTACCAG CATACAAGGG GTCAAACAAA 
551 TCATCAAAGC CCTGCGTTCG GGCGAAGCAA CCATCGTCCT GCCCGACCAC 
601 GTCCCCTCCC CTCAAGAAGG CGGGGAAGGC GTATGGGTGG ATTTCTTCGG 
651 CAAACCTGCC TATACCATGA CGCTGGCGGC AAAATTGGCA CACGTCAAAG 
701 GCGTGAAAAC CCTGTTTTTC TGCTGCGAAC GCCTGCCTGG CGGACAAGGT 
751 TTCGATTTGC ACATCCGCCC CGTCCAAGGG GAATTGAACG GCGACAAAGC 
801 CCATGATGCC GCCGTGTTCA ACCGCAATGC CGAATATTGG ATACGCCGTT
851 TTCCGACGCA GTATCTGTTT ATGTACAACC GCTACAAAAT GCCGTAA 

This corresponds to the amino acid sequence [<SEQ ID 568; ORF138-1>] (SEQ ID NO: 568;
ORF138-1):

1 MFRLQFRLFP PLRTAMHILL TALLKCLSLL PLSCLHTLGN RLGHLAFYLL

51 KEDRARIVAN MRQAGLNPDP KTVKAVFAET AKGGLELAPA FFRKPEDIET

101 MFKAVHGWEH VQQALDKHEG LLFITPHIGS YDLGGRYISQ QLPFPLTAMY

151 KPPKIKAIDK IMQAGRVRGK GKTAPTSIQG VKQIIKALRS GEATIVLPDH

201 VPSPQEGGEG VWVDFFGKPA YTMTLAAKLA HVKGVKTLFF CCERLPGGQG

251 FDLHIRPVQG ELNGDKAHDA AVFNRNAEYW IRRFPTQYLF MYNRYKMP*

Computer analysis of this amino acid sequence gave the following results: 
Homology with a predicted ORF from N. meningitidis (strain A)

ORF138 (SEQ ID NO: 566) shows 99.2% identity over a 123aa overlap with an ORF (ORF138a)
(SEQ ID NO: 570) from strain A of N. meningitidis:

10 20 30 40 50 60 

orf 13 8 .pep MFRLQFRLFP PLRTAMH I LLTALLKCLSLLPLSCLHTLGNRLGHLAFYLLKEDRAR I VAX 

1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 M 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 H I II 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 
orf138a MFRLQFRLFPPLRTAMHILLTALLKCLSLLPLSCLHTLGNRLGHLAFYLLKEDRARIVAN

10 20 30 40. 50 60 

70 80 90 100 110 120 

orf 138 .pep MRQAGLNPDPKTVKAVFAETAKGGLELAPAFFRKPEDIETMFKAVHGWEHVQQALDKHEG 

5 IMIII lllllll! IMIIII IIMIIIIIIIIIIIIIMMIIIIIIMIIMMMM 

or f 13 8a MRQAGLNPDPKTVKAVFAETAKGGLELAPAFFRKPEDIETMFKAVHGWEHVQQALDKHEG 

70 80 90 100 110 120 

orf138.pep   LLF
             |||
orf138a      LLFITPHIGSYDLGGRYISQQLPFPLTAMYKPPKIKAIDKIMQAGRVRGKGKTAPTSIQG

130 t 140 150 160 170 180 

The complete length ORF138a nucleotide sequence [<SEQ ID 569>] (SEQ ID NO: 569) is:

1 ATGTTTCGTT TACAATTCAG GCTGTTTCCC CCTTTGCGAA CCGCCATGCA

51 CATCCTGTTG ACCGCCCTGC TCAAATGCCT CTCCCTGCTG CCGCTTTCCT 

101 GTCTGCACAC GCTGGGAAAC CGGCTCGGAC ATCTGGCGTT TTACCTTTTA 

151 AAGGAAGACC GCGCGCGCAT CGTCGCCAAT ATGCGTCAGG CAGGCATGAA 

201 TCCCGACCCC AAAACGGTCA AAGCCGTTTT TGCGGAAACG GCAAAAGGCG 

251 GTTTGGAACT TGCCCCCGCG TTTTTCAGAA AACCGGAAGA CATAGAAACA

301 ATGTTCAAAG CGGTACACGG CTGGGAACAT GTGCAGCAGG CTTTGGACAA 

351 ACACGAAGGG CTGCTATTCA TCACGCCGCA CATCGGCAGC TACGATTTGG 

401 GCGGACGCTA CATCAGCCAG CAGCTTCCGT TCCCGCTGAC CGCCATGTAC 

451 AAACCGCCGA AAATCAAAGC GATAGACAAA ATCATGCAGG CGGGCAGGGT 

501 TCGCGGCAAA GGAAAAACCG CGCCTACCAG CATACAAGGG GTCAAACAAA

551 TCATCAAAGC CCTGCGTTCG GGCGAAGCAA CCATCGTCCT GCCCGACCAC 

601 GTCCCCTCCC CTCAAGAAGG CGGGGAAGGC GTATGGGTGG ATTTCTTCGG 

651 CAAACCTGCC TATACCATGA CGCTGGCGGC AAAATTGGCA CACGTCAAAG 

701 GCGTGAAAAC CCTGTTTTTC TGCTGCGAAC GCCTGCCTGG CGGACAAGGT 

751 TTCGATTTGC ACATCCGCCC CGTCCAAGGG GAATTGAACG GCGACAAAGC

801 CCATGATGCC GCCGTGTTCA ACCGCAATGC CGAATATTGG ATACGCCGTT 

851 TTCCGACGCA GTATCTGTTT ATGTACAACC GCTACAAAAT GCCGTAA 

This encodes a protein having amino acid sequence [<SEQ ID 570>] (SEQ ID NO: 570):

1 MFRLQFRLFP PLRTAMHILL TALLKCLSLL PLSCLHTLGN RLGHLAFYLL

51 KEDRARIVAN MRQAGMNPDP KTVKAVFAET AKGGLELAPA FFRKPEDIET

101 MFKAVHGWEH VQQALDKHEG LLFITPHIGS YDLGGRYISQ QLPFPLTAMY 

151 KPPKIKAIDK IMQAGRVRGK GKTAPTSIQG VKQIIKALRS GEATIVLPDH 

201 VPSPQEGGEG VWVDFFGKPA YTMTLAAKLA HVKGVKTLFF CCERLPGGQG

251 FDLHIRPVQG ELNGDKAHDA AVFNRNAEYW IRRFPTQYLF MYNRYKMP*

ORF138a (SEQ ID NO: 570) and ORF138-1 (SEQ ID NO: 568) show 99.7% identity over a 298aa
overlap:

orf138a.pep  MFRLQFRLFPPLRTAMHILLTALLKCLSLLPLSCLHTLGNRLGHLAFYLLKEDRARIVAN
             ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
orf138-1     MFRLQFRLFPPLRTAMHILLTALLKCLSLLPLSCLHTLGNRLGHLAFYLLKEDRARIVAN

orf138a.pep  MRQAGMNPDPKTVKAVFAETAKGGLELAPAFFRKPEDIETMFKAVHGWEHVQQALDKHEG
             |||||:||||||||||||||||||||||||||||||||||||||||||||||||||||||
orf138-1     MRQAGLNPDPKTVKAVFAETAKGGLELAPAFFRKPEDIETMFKAVHGWEHVQQALDKHEG

orf138a.pep  LLFITPHIGSYDLGGRYISQQLPFPLTAMYKPPKIKAIDKIMQAGRVRGKGKTAPTSIQG
             ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
orf138-1     LLFITPHIGSYDLGGRYISQQLPFPLTAMYKPPKIKAIDKIMQAGRVRGKGKTAPTSIQG

orf138a.pep  VKQIIKALRSGEATIVLPDHVPSPQEGGEGVWVDFFGKPAYTMTLAAKLAHVKGVKTLFF
             ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
orf138-1     VKQIIKALRSGEATIVLPDHVPSPQEGGEGVWVDFFGKPAYTMTLAAKLAHVKGVKTLFF

orf138a.pep  CCERLPGGQGFDLHIRPVQGELNGDKAHDAAVFNRNAEYWIRRFPTQYLFMYNRYKMP
             ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
orf138-1     CCERLPGGQGFDLHIRPVQGELNGDKAHDAAVFNRNAEYWIRRFPTQYLFMYNRYKMP

Homology with a predicted ORF from N. gonorrhoeae

ORF138 (SEQ ID NO: 566) shows 94.3% identity over a 123aa overlap with a predicted ORF
(ORF138ng) (SEQ ID NO: 572) from N. gonorrhoeae:

or f 13 8 . pep MFRLQFRLFPPLRTAMHILLTALLKCLSLLPLSCLHTLGNRLGHLAFYLLKEDRARI VAX 6 0 

I I I I II I II I II I I II I I I II I I I II I I I I I I M II I I I I I I I I I I I I I I I II I I I II 
1 5 orf 138ng MFRLQFRLFPPLRTAMHILLTALLKCLSLLSLSCLHTLGNRLGHLAFYLLKEDRARIVAN 60 

orf 13 8 . pep MRQAGLNPDPKTVKAVFAETAKGGLELAPAFFRKPEDIETMFKAVHGWEHVQQALDKHEG 120 

Illllllll MINIMUM 1 1 1 1 1 1 1 1 1 : 1 1 1 1 1 M 1 1 1 1 1 1 1 M I i 1 1 M - II 

or f 1 3 8ng MRQAGLNPDTQTVKAVFAETAKCGLELAPAFFKKPED I ETMFKAVHGWEHVQQALDKGEG 120 

orf 138. pep LLF 123 

20 Ml 

orf 138ng LLFITPHIGSYDLGGRYISQQLPFHLTAMYKPPKIKAIDKIMQAGRVRGKGKTAPTGIQG 180 

The complete length ORF138ng nucleotide sequence [<SEQ ID 571>] (SEQ ID NO: 571) is:

1 ATGTTTCGTT TACAATTCAG GCTGTTTCCC CCTTTGCGAA CCGCCATGCA 

51 CATCCTGTTG ACCGCCCTGC TCAAATGCCT CTCCCTGCTG TCGCTTTCCT

101 GTCTGCACAC GCTGGGAAAC CGGCTCGGAC ATCTGGCGTT TTACCTTTTA 

151 AAGGAAGACC GCGCGCGCAT CGTCGCCAAT ATGCGGCAGG CGGGTTTGAA 

201 CCCCGACACG CAGACGGTCA AAGCCGTTTT TGCGGAAACG GCAAAATGCG 

251 GTTTGGAACT TGCCCCCGCG TTTTTCAAAA AACCGGAAGA CATCGAAACA 

301 ATGTTCAAAG CGGTACACGG CTGGGAACAC GTGCAGCAGG CTTTGGACAA

351 GGGCGAAGGG CTGCTGTTCA TCACGCCGCA CATCGGCAGC TACGATTTGG

401 GCGGACGCTA CATCAGCCAG CAGCTTCCGT TCCACCTGAC CGCCATGTAC
451 AAGCCGCCGA AAATCAAAGC GATAGACAAA ATCATGCAGG CGGGCAGGGT
501 GCGCGGCAAA GGCAAAACcg cgcccaccgg catACAAGGG GTCAAACAAA 

551 tcatcaAGGC CCTGCGCGCG GGCGAGGCAA CCAtcATCCT GCCCGACCAC

601 GTCCCTTCTC CGCAGGAagg cggCGGCGTG TGGGCGGATT TTTTCGGCAA 

651 ACCTGCATAc acCATGACAC TGGCGGCAAA ATTGGCACAC GTCAAAGGCG 

701 TGAAAACCCT GTTTTTCTGC TGCGAACGCC TGCCCGACGG ACAAGGCTTC 

751 GTGTTGCACA TCCGCCCCGT CCAAGGGGAA TTGAACGGCA ACAAAGCCCA 

801 CGATGCCGCC GTGTTCAACC GCAATACCGA ATATTGGATA CGCCGTTTTC

851 CGACGCAGTA TCTGTTTATG TACAACCGCT ATAAAACGCC GTAA 



This encodes a protein having amino acid sequence (SEQ ID NO: 572):

1 MFRLQFRLFP PLRTAMHILL TALLKCLSLL SLSCLHTLGN RLGHLAFYLL

51 KEDRARIVAN MRQAGLNPDT QTVKAVFAET AKCGLELAPA FFKKPEDIET

101 MFKAVHGWEH VQQALDKGEG LLFITPHIGS YDLGGRYISQ QLPFHLTAMY

151 KPPKIKAIDK IMQAGRVRGK GKTAPTGIQG VKQIIKALRA GEATIILPDH

201 VPSPQEGGGV WADFFGKPAY TMTLAAKLAH VKGVKTLFFC CERLPDGQGF

251 VLHIRPVQGE LNGNKAHDAA VFNRNTEYWI RRFPTQYLFM YNRYKTP*

ORF138ng (SEQ ID NO: 572) and ORF138-1 (SEQ ID NO: 568) show 94.3% identity over 299aa
overlap:

1 0 orf 13 8 - 1 . pep MFRLQFRLFPPLRTAMH I LLTALLKCLSLLPLSCLHTLGNRLGHLAFYLL KEDRARIVAN 

1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 M 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 

orf 13 8ng MFRLQFRLFPPLRTAMHILLTALLKCLSLLSLSCLHTLGNRLGHLAFYLLKEDRARIVAN 

orf 13 8 - 1 . pep MRQAGLNPDPKTVKAVFAETAKGGLELAPAFFRKPEDIETMFKAVHGWEHVQQALDKHEG 

Illllllll ^ I I I ! I I I [ I I 1 I I I I I I I I hi I I I I I I I II I I I I I I I I I i I I I || 
15 orf 13 8ng MRQAGLNPDTQTVKAVFAETAKCGLELAPAFFKKPEDIETMFKAVHGWEHVQQALDKGEG 

orf 138-1 .pep LLFITPHIGSYDLGGRYISQQLPFPLTAMYKPPKIKAIDKIMQAGRVRGKGKTAPTSIQG 

IIIMIIMI II II MM II I III 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 II 1 1 1 1 1 1 1 1 1 1 1 1 h 1 1 1 

orf 13 8ng LLFITPHIGSYDLGGRYISQQLPFHLTAMYKPPKIKAIDKIMQAGRVRGKGKTAPTGIQG 

orf 13 8 - 1 . pep VKQI I KALRSGEATIVLPDHVPSPQEGGEGVWVDFFGKPAYTMTLAAKLAHVKGVKTLFF 

20 I I I I I I I I I : I I I I I : M I I I I M I I I I I I h I I M I I I I I I I I I I I I I I II I I I I I I I 

or f 1 3 8 ng VKQ 1 1 KALRAGEAT 1 1 LPDHVPSPQEGG - GVWADFFGKPAYTMTLAAKLAHVKGVKTLFF 

orf 13 8 - 1 . pep CCERLPGGQGFDLH I RPVQGELNGDKAHDAAVFNRNAEYW I RRFPTQYLFM YNRYKMP 

IIMII I I I I I INI II II llh II II II II llh II II II II II II II II II I I 
or f 13 8ng CCERLPDGQGFVLH I RPVQGELNGNKAHDAAVFNRNTEYW I RRFPTQYLFM YNRYKTP 

In addition, ORF138ng (SEQ ID NO: 572) is homologous to htrB protein (SEQ ID NO: 1147)
from Pseudomonas fluorescens:

gnl|PID|e334283 (Y14568) htrB [Pseudomonas fluorescens] Length = 253
Score = 80.8 bits (196), Expect = 9e-15
Identities = 49/151 (32%), Positives = 79/151 (51%), Gaps = 6/151 (3%)

Query: 101 MFKAVHGWEHVQQALDKGEGLLFITPHIGSYD-LGGRYISQQLPFHLTAMYKPPKIKAID 159
           + + V G E + + +AL G+G++ IT H+G+ + + L Y SQ P Y+ PPK+KA+D
Sbjct: 94  LVREVEGLEVLKEALASGKGWGITSHLGNWEVLNHFYCSQCKPI 1 FYRPPKLKAVD 150

Query: 160 KIMQAGRVRGKGKTAPTGIQGVKQIIKALRAGEATIILPDHVPSPQEGGGVWADFFGKPA 219
           + + + + RV+ K A + +G+ +IK +R G I D P P E G++ FF A
Sbjct: 151

Query: 220
           +F RLPDG G+
Sbjct: 209



Based on this analysis, including the presence of a putative transmembrane domain in the
gonococcal protein, it was predicted that the proteins from N. meningitidis and N. gonorrhoeae, and
their epitopes, could be useful antigens for vaccines or diagnostics, or for raising antibodies.






ORF138-1 (SEQ ID NO: 568) (57kDa) was cloned in the pGex vectors and expressed in E. coli, as
described above. The products of protein expression and purification were analyzed by SDS-
PAGE. Figure 14A shows the results of affinity purification of the GST-fusion protein. Purified
GST-fusion protein was used to immunise mice, whose sera were used for ELISA (positive result)
and FACS analysis (Figure 14B). These experiments confirm that ORF138-1 (SEQ ID NO: 568) is
a surface-exposed protein, and that it is a useful immunogen.

Example 69 

The following partial DNA sequence was identified in N. meningitidis [<SEQ ID 573>] (SEQ ID
NO: 573):

1 . . GCGTGGTCGG CCGGCGAATC GTGGCGTGTG TTAATGGAAA GTGAAACGTG 

51 GCATGCGGTG TGGAATACTT TGCGCTTCTC GGCGGCGGCG GTGTATGCGG 

101 CAGCGGTTTT GGGTGTGGTG TATGCGGCGC CGGCGCGGCG GTCGGCGTGG 

151 ATGCGCGGGC TGATGTTTTA GCCGTTTATG GTGTCGCCGG TTTGTGTTTC 

201 GGCGGGCGTG CTGCTGCTTT ATCCGCAGTG GACGGCTTCG TTGCCGTTGC 

251 TGCTGGCGAT GTATGCGCTG CTGGCGTATC CGTTTGTGGC AAAAGATGTT 

301 TTATCAGCCT GGGATGCACT GCCGCCGGAT TACGGCAGGG CGGCGGCGGG 

351 TTTGGGTGCA AACGGCTTTC AGACGGCATG CCGCATCACG TTCCCCCTCT 

401 TGAAACCGGC GTTGCGGCGC GGTCTGACTT TGGCGGCGGC AACCTGCGTG

451 GGCGAATTTG CGGCGACATT GTTTCTGTCG CGTCCGGAAT GGCAGACGCT 

501 GACGACTTTG ATTTATGCCT ATTTGGGACG CGCGGGTGAG GATAATTACG 

551 CGCGGGCGAT GGTGCTG. . 

This corresponds to the amino acid sequence [<SEQ ID 574; ORF139>] (SEQ ID NO: 574;
ORF139):

1 . . AWSAGESWRV LMESETWHAV WNTLRFSAAA VYAAAVLGVV YAAPARRSAW
51 MRGLMFXPFM VSPVCVSAGV LLLYPQWTAS LPLLLAMYAL LAYPFVAKDV
101 LSAWDALPPD YGRAAAGLGA NGFQTACRIT FPLLKPALRR GLTLAAATCV
151 GEFAATLFLS RPEWQTLTTL IYAYLGRAGE DNYARAMVL . .

Further work revealed the complete nucleotide sequence [<SEQ ID 575>] (SEQ ID NO: 575):

1 ATGGATGGAC GGCGTTGGGT GGTATGGGGT GCTTTTGCCC TGCTGCCTTC 

51 GGCTTTTTTG GCGGTAATGG TCGTTGCGCC TTTGTGGGCG GTGGCGGCGT 

101 ATGACGGTTT GGCGTGGCGC GCGGTGCTGT CGGATGCCTA TATGCTCAAA 

151 CGTTTGGCGT GGACGGTATT TCAGGCAGCG GCAACCTGTG TGCTGGTGCT 

201 GCCTTTGGGC GTGCCTGTCG CGTGGGTGCT GGCGCGGCTG GCGTTTCCGG

251 GGCGGGCTTT GGTGCTGCGC CTGCTGATGC TGCCTTTTGT GATGCCCACG
301 TTGGTGGCGG GCGTGGGCGT GCTGGCCCTG TTCGGGGCGG ACGGGCTGTT 

351 GTGGCGCGGC AGGCAGGATA CGCCGTATCT GTTGTTGTAC GGCAATGTGT
401 TTTTCAACCT TCCTGTGTTG GTCAGGGCGG CGTATCAGGG GTTTGTGCAA 

451 GTGCCTGCGG CACGGCTTCA GACGGCACGG ACGTTGGGCG CGGGGGCGTG
501 GCGGCGGTTT TGGGACATTG AAATGCCCGT TTTGCGCCCG TGGCTTGCCG 
551 GCGGCGTGTG CCTTGTCTTT CTGTATTGTT TTTCCGGGTT CGGGCTGGCG 
601 CTGCTGCTGG GCGGCAGCCG TTATGCCACG GTCGAAGTGG AAATTTACCA 
651 GTTGGTCATG TTCGAACTCG ATATGGCGGT TGCTTCGGTG CTGGTGTGGC

701 TGGTGTTGGG GGTAACGGCG GCGGCAGGGT TGCTGTATGC GTGGTTCGGC 

751 AGGCGCGCGG TTTCGGATAA GGCGGTTTCC CCTGTGATGC CGTCGCCGCC 

801 GCAGTCGGTC GGGGAATATG TGCTGCTGGC GTTTGCGGCG GCGGTGTTGT 

851 CTGTGTGCTG CCTGTTTCCT TTGTTGGCAA TTGTTGTGAA AGCGTGGTCG 

901 GCCGGCGAAT CGTGGCGTGT GTTAATGGAA AGTGAAACGT GGCAGGCGGT 

951 GTGGAATACT TTGCGCTTCT CGGCGGCGGC GGTGTATGCG GCGGCGGTTT 

1001 TGGGTGTGGT GTATGCGGCG GCGGCGCGGC GGTCGGCGTG GATGCGCGGG 

1051 CTGATGTTTT TGCCGTTTAT GGTGTCGCCG GTTTGTGTTT CGGCGGGCGT 

1101 GCTGCTGCTT TATCCGCAGT GGACGGCTTC GTTGCCGTTG CTGCTGGCGA 

1151 TGTATGCGCT GCTGGCGTAT CCGTTTGTGG CAAAAGATGT TTTATCAGCC 

1201 TGGGATGCAC TGCCGCCGGA TTACGGCAGG GCGGCGGCGG GTTTGGGTGC

1251 AAACGGCTTT CAGACGGCAT GCCGCATCAC GTTCCCCCTC TTGAAACCGG
1301 CGTTGCGGCG CGGTCTGACT TTGGCGGCGG CAACCTGCGT GGGCGAATTT 

1351 GCGGCGACAT TGTTTCTGTC GCGTCCGGAA TGGCAGACGC TGACGACTTT

1401 GATTTATGCC TATTTGGGAC GCGCGGGTGA GGATAATTAC GCGCGGGCGA
1451 TGGTGCTGAC ATTGCTGTTG GCGGCGTTCG CGCTGGGTAT TTTCCTGCTG 
1501 TTGGACGGCG GCGAAGGCGG AAAACAGACG GAAACGTTAT AA 
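A partial sequence need not begin at the start of the complete sequence; in this Example the partial fragment appears to correspond to an internal region of the complete ORF. One way to place such a fragment is an ungapped sliding comparison that tolerates a small number of mismatches. The sketch below illustrates the idea only and is not the method used in this specification; the name `locate_fragment`, the default mismatch limit, and the treatment of N as matching any base are assumptions.

```python
def locate_fragment(fragment: str, full: str, max_mismatches: int = 5):
    """Return (offset, mismatches) for the best ungapped placement, or None.

    The offset is 0-based. Non-letter characters from the printed listing
    are ignored. An 'N' in either sequence is treated as matching any base.
    """
    frag = "".join(ch for ch in fragment.upper() if ch.isalpha())
    ref = "".join(ch for ch in full.upper() if ch.isalpha())
    best = None
    for offset in range(len(ref) - len(frag) + 1):
        mismatches = 0
        for a, b in zip(frag, ref[offset:offset + len(frag)]):
            if a != b and a != "N" and b != "N":
                mismatches += 1
                if mismatches > max_mismatches:
                    break  # too many differences at this offset
        else:
            # Inner loop finished without exceeding the limit.
            if best is None or mismatches < best[1]:
                best = (offset, mismatches)
    return best
```

Because the comparison is ungapped, a fragment containing an insertion or deletion relative to the complete sequence would need a proper alignment tool instead.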

This corresponds to the amino acid sequence [<SEQ ID 576; ORF139-1>] (SEQ ID NO: 576;
ORF139-1):

1 MDGRRWVVWG AFALLPSAFL AVMVVAPLWA VAAYDGLAWR AVLSDAYMLK

51 RLAWTVFQAA ATCVLVLPLG VPVAWVLARL AFPGRALVLR LLMLPFVMPT

101 LVAGVGVLAL FGADGLLWRG RQDTPYLLLY GNVFFNLPVL VRAAYQGFVQ

151 VPAARLQTAR TLGAGAWRRF WDIEMPVLRP WLAGGVCLVF LYCFSGFGLA

201 LLLGGSRYAT VEVEIYQLVM FELDMAVASV LVWLVLGVTA AAGLLYAWFG

251 RRAVSDKAVS PVMPSPPQSV GEYVLLAFAA AVLSVCCLFP LLAIVVKAWS

301 AGESWRVLME SETWQAVWNT LRFSAAAVYA AAVLGVVYAA AARRSAWMRG
351 LMFLPFMVSP VCVSAGVLLL YPQWTASLPL LLAMYALLAY PFVAKDVLSA
401 WDALPPDYGR AAAGLGANGF QTACRITFPL LKPALRRGLT LAAATCVGEF

451 AATLFLSRPE WQTLTTLIYA YLGRAGEDNY ARAMVLTLLL AAFALGIFLL
501 LDGGEGGKQT ETL*

Computer analysis of this amino acid sequence gave the following results: 
Homology with a predicted ORF from N. meningitidis (strain A) 

ORF139 (SEQ ID NO: 574) shows 94.7% identity over a 189aa overlap with an ORF (ORF139a)
(SEQ ID NO: 578) from strain A of N. meningitidis:

10 20 30 

orf 13 9 . pep AWSAGESWRVLMESETWHAVWNTLRFSAAA 

Illlllllllllllllhlllll I I I I I I 
orf 13 9a QSVGEYVLLAFA AAVXSVCCLFXLLAIW KAWSAGESWRVLMESETWQAVWNTXRFS AAA 

v 270 280 290 300 310 . 320 

40 50 60 70 80 90 

orf 13 9 . pep VYAAAVLGWYAAPARRSAWMRGLMFXPFMVSPVCVSAGVLLLYPQWTASLPLLLAMYAL 

. I I I I I I I I I I I I M I I I I I I I I I II II I I I I I I I I I I I I I I I I I I II I I I I I I I I I I 
orf 13 9a VYAAAVLGWYAAAARRSAWMRGLMFLPFMVSPVCVSAGVLLLXPQWTASLPLLLANYAL 

, 330 340 350 360 370 380 

100 110 120 130 140 150 
orf139.pep LAYPFVAKDVLSAWDALPPDYGRAAAGLGANGFQTACRITFPLLKPALRRGLTLAAATCV

! 1 1 1 1 1 M 1 1 1 1 1 MMMIIIIIIII Mil IIIMIIIIIMIIIIIMIMIIIIII 

orf 139a LAYPFVA KDVLSAXDALPPDYGRAAAGLGANGFQTACRITFPLLKPALRRGLTLAAATCV 
390 400 410* 420 430 440 

160 170 180 189 

orf 13 9 . pep GEFAATLFLSRPEWQTLTTLIYAYLGRAGEDNYARAMVL 

MINI! II llllllllllll II 1 1 lllllllll 
orf139a GEFAATLFXSRXEWQTLTTLIYAYXGRAGXDNYARAMVLTLLLAAFALGXFLLLDGGEGG

450 460 470 480 490 500 

The complete length ORF139a nucleotide sequence [<SEQ ID 577>] (SEQ ID NO: 577) is:

1 atggatggac ggcgttgggc ggtatggggt gcttttgccc tgctgccttc 

51 ggcttttttg gcggcaatgg tcgttgcgcc tttgtgggcg gtggcggcgt 

101 atgacggttt ggcgtggcgc gcggtgctgt cggatgccta tatgctcaaa 

151 cgtttggcgt ggacggtatt tcaggcagcg gcaacctgtg tgctggtgct 

201 gcctttgggc gtgcctgtcg cgtgggtgct ggcgcggctg gcgtttccgg 

251 GGCGGGCTTT GGTGCTGCGC CTGCTGATGC TGCCTTTTGT GATGCCCACG 

301 TTGGTGGCGG GCGTGGGCGT GCTGGCTCTG TTCGGGGCGG ACGGCCTGTN 

351 GTGGCGCGGC TGGCAGGATA CGCCGTATCT GTTGTTGTAC GGCAATGTGT 

401 TTTTTNACCT TCCTGTGTTG GTCAGGGCGG CATATCAGGG GTTTGTGCAA

451 GTGCCTGCGG CACGGCTTCA GACGGCACNG ACATTGGGCG CGGGGGCGTG

501 GCGGCGGTTT TGGGACATTG AAATGCCCGT TTTGCGCCCG TGGCTTGCCG

551 GCGGCGTGTG CCTTGTCTTC CTGTATTGTT TTTCGGGGTT CGGGCTGGCA

601 TTGCTGCTGG GCGGCAGCCG TTATGCCACG GTCGAAGTGG AAATTTACCA 

651 GTTGGTCATG TTCGAACTCG ATATGGCGGT TGCTTCGGTG CTNGTGTGGC 

701 TGGTGTNGGG GGTAACNGCG GCGGCAGGGT TGCTGTATGC GTGGTTCGGC 

751 AGGCGCGCGG TTTCGGATAA GGCNGTTTCC CCTGTGATGC CGTCGCCGCC 

801 GCAGTCGGTC GGGGAATATG TGCTNCTGGC GTTTGCGGCG GCGGTGTNGT 

851 CTGTGTGCTG CCTGTTTCNT TTGTTGGCAA TTGTTGTGAA AGCGTGGTCG 

901 GCCGGCGAAT CGTGGCGTGT GTTAATGGAA AGTGAAACGT GGCAGGCGGT 

951 GTGGAATACT NTGCGCTTCT CGGCGGCGGC GGTGTATGCG GCGGCGGTTT 

1001 TGGGTGTGGT GTATGCGGCG GCGGCGCGGC GGTCGGCGTG GATGCGCGGG 

1051 CTGATGTTTT TGCCGTTTAT GGTGTCGCCG GTTTGTGTTT CGGCGGGCGT 

1101 GCTGCTGCTT NATCCGCAGT GGACGGCTTC GTTGCCGCTG CTGCTGGCGA 

1151 TGTATGCGCT GCTGGCGTAT CCGTTTGTGG CAAAAGATGT TTTATCAGCC 

1201 TGNGATGCAC TGCCGCCGGA TTACGGCAGG GCGGCGGCGG GTTTGGGTGC 

1251 AAACGGCTTT CAGACGGCAT GCCGCATCAC GTTCCCCCTC TTGAAACCGG

1301 CGTTGCGGCG CGGTCTGACT TTGGCGGCGG CAACCTGCGT GGGCGAATTT 

1351 GCGGCAACCT TGTTCNTGTC GCGTCNCGAG TGGCAGACGC TGACGACTTT 

1401 GATTTATGCC TATNTGGGAC GCGCGGGTGA NGATAATTAC GCGCGGGCGA

1451 TGGTGCTGAC ATTGCTGTTG GCGGCGTTCG CGCTGGGTAT NTTCCTGCTG

1501 TTGGACGGCG GCGAAGGCGG AAAACGGACG GAAACGTTAT AA 

This encodes a protein having amino acid sequence [<SEQ ID 578>] (SEQ ID NO: 578):



1 MDGRRWAVWG afallpsafl aamwaplwa vaaydglawr AVLSDAYMLK 

51 rlawtvfqaa atcvlvlplg vpvawv larl afpgralvlr llml pfvmpt 

101 LVAGVGVLAL FGAD GLXWRG WQDTPYLLLY GNVFFXLPVL VRAAYQGFVQ 

151 VPAARLQTAX TLGAGAWRRF WDIEMPVLRP WLAGG VCLVF LYCFSGFGLA 

201 LLLGGSRYAT VEVEIYQLVM FELDMAV ASV LVWLVXGVTA AAGLL YAWFG 

251 RRAVSDKAVS PVMPSPPQSV GEYVLLAF AA AVXSVCCLFX LLAIW KAWS 

301 AGESWRVLME SETWQAVWNT XRFS AAAVYA AAVLGWYAA AA RRSAWMRG 

351 LMFLPFMVSP VCVSAGVLLL XPQWTASLPL LLAMYALLAY P FVAKDVLS A 

401 XDALPPDYGR AAAGLGANGF QTACRITFPL LKPALRRGLT LAAATCVGEF 

451 AATLFXSRXE WQTLTTLIYA YXGRAGXDNY ARAM VLTLLL AAFALGXFLL

501 LDGGEGGKRT ETL* 
ORF139a (SEQ ID NO: 578) and ORF139-1 (SEQ ID NO: 576) show 96.5% homology over a
514aa overlap:

or f 1 3 9a . pep MDGRRWAVWGAFALLPSAPLAAMWAPLWAVAAYDGLAWRAVLSDAYMLKRLAWTVFQAA 

I I I 1 M : 1 I : I I I I t I I I I I I I 1 I I i 1 I I 1 i I I I I I I I I I I 1 

orf139-1 MDGRRWVyWGAFALLPSAFLAVMWAPLWAVAAYDGLAWRAVLSDAYMLKRLAWTVFQAA

or f 1 3 9a . pep ATCVLVLPLGVPVAWVLARLAFPGRALVLRLLMLPFVMPTLVAGVGVLALFGADGLXWRG 

III MIMIIMI II II MM I II IIMIMIIII I MM MM II I Ml Mill III 

or f 1 3 9 - 1 ATCVLVLPLGyPVAWVLARLAFPGRALVLRLLMLPFVMPTLVAGVGVLALFGADGLLWRG 
orf 139a. pep WQDTPYLLLYGNVFFXLPVLVRAAYQGFVQVPAARLQTAXTLGAGAWRRFWDIEMPVLRP 

10 IIIIIIIIIIIIM II IIIIMIMIIIMIIIMI! Mill llllll IIIIIIMI 

orf 13 9-1 RQDTPYLLLYGNVFFNLPVLVRAAYQGFVQVPAARLQTARTLGAGAWRRFWDIEMPVLRP 
or f 13 9a . pep WLAGGVCLVFLYCFSGFGLALLLGGSRYATVEVEI YQLVMFELDMAVASVLVWLVXGVTA 

; i 1 1 1 1 1 1 1 j i 1 1 1 1 1 1 1 1 i i ill i [ 1 1 1 1 1 1 1 

or f 1 3 9 - 1 WLAGGVCLVFLYCFSGFGLALLLGGSRYATVEVEI YQLVMFELDMAVASVLVWLVLGVTA 

orf139a.pep AAGLLYAWFGRRAVSDKAVSPVMPSPPQSVGEYVLLAFAAAVXSVCCLFXLLAIWKAWS

III II IIIIIIMI 1 1 II I MM 1 1 IIIIIIIIIIIIM I II MUM I II II II III 

orf 13 9-1 AAGLLYAWFGRRAVSDKAVSPVMPSPPQSVGEYVLLAFAAAVLSVCCLFPLLAIWKAWS 
orf 13 9a . pep AGESWRVLMESETWQAVWNTXRFSAAAVYAAAVLGWYAAAARRSAWMRGLMFLPFMVSP 

II I II Mill MM Mill I 1 1 1 1 1 1 1 ! 1 1 1 1 1 1 1 1 1 1 1 1 1 1 M II 1 1 1 1 1 1 1 1 1 1 II 

orf139-1 AGESWRVLMESETWQAVWNTLRFSAAAVYAAAVLGWYAAAARRSAWMRGLMFLPFMVSP

orf 13 9a . pep VCVSAGVLLLXPQWTASLPLLLAMYALLAYPFVAKDVLSAXDALPPDYGRAAAGLGANGF 

I I I I I I M I I I I I I I I I I I I I I I I I I I II I I I I I I I I I I I I I I I I I II II I I I I I 
orf 139-1 VCVSAGVLLLYPQWTASLPLLLAMYALLAYPFVAKDVLSAWDALPPDYGRAAAGLGANGF 

orf 13 9a. pep QTACRITFPLLKPALRRGLTLAAATCVGEFAATLFXSRXEWQTLTTLIYAYXGRAGXDNY 

25 | | | | | | || || || | | || II II II II I I I I I I I I I I I I I II II I II I I I I I I I I I I I I 

orf 13 9- 1 QTACRITFPLLKPALRRGLTLAAATCVGEFAATLFLSRPEWQTLTTLIYAYLGRAGEDNY 

orf 13 9a . pep ARAMVLTLLLAAFALGXFLLLDGGEGGKRTETLX 

1 1 1 1 1 1 ; 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 hi 1 1 1 1 

orf 13 9- 1 ARAMVLTLLLAAFALGIFLLLDGGEGGKQTETLX 
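
The identity figures quoted for these pairwise comparisons can be reproduced in outline as follows. This is a minimal sketch, not the alignment program used to generate the figures above: the function name percent_identity is hypothetical, the two sequences are assumed to be already aligned and of equal length, and the ambiguity code X is treated as an ordinary character (so X aligned with a defined residue counts as a mismatch), which may differ from how the original software scored such positions. The input is the final 34-column block of the ORF139a / ORF139-1 comparison.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> tuple[int, int, float]:
    """Return (identical columns, compared columns, percent identity)."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    # Skip columns where both sequences carry a gap character.
    columns = [(a, b) for a, b in zip(aligned_a.upper(), aligned_b.upper())
               if not (a == "-" and b == "-")]
    if not columns:
        raise ValueError("no columns to compare")
    # A gap aligned with a residue is a compared column but never a match.
    identical = sum(1 for a, b in columns if a == b and a != "-")
    return identical, len(columns), 100.0 * identical / len(columns)


# Last block of the ORF139a / ORF139-1 alignment (X = ambiguous residue or stop).
orf139a_tail = "ARAMVLTLLLAAFALGXFLLLDGGEGGKRTETLX"
orf139_1_tail = "ARAMVLTLLLAAFALGIFLLLDGGEGGKQTETLX"

identical, compared, pct = percent_identity(orf139a_tail, orf139_1_tail)
print(f"{identical}/{compared} identical ({pct:.1f}%)")
```

For this block the two sequences differ in two of 34 columns (X/I and R/Q), so the sketch reports 32/34. The figure for the whole overlap would be obtained by passing the full aligned sequences.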

Homology with a predicted ORF from N. gonorrhoeae

ORF139 (SEQ ID NO: 574) shows 95.2% identity over a 189aa overlap with a predicted ORF
(ORF139ng) (SEQ ID NO: 580) from N. gonorrhoeae:



orf 139 .pep AWSAGESWRVLMESETWHAVWNTLRFSAAA 30 

llllll ! I I I I I I I M I I I II I ' I I I I 

orf139ng QSVGEYVLLAFSVAVLSVCCLFPLSAIWKAWSAGESRRVLMESETWQAVWNTLRFSAAA 327

orf 13 9 . pep WAAAVLGVVYAAPARRSAWMRGLMFXPFMVSPVCVSAGVLLLYPQWTASLPLLLAMYAL 90 

Mill ill II I Ml MM I hi IIIIIMIIMIIIIMI II MM MM MM 

orf 13 9ng VFAAAVLGVVYAAAARRLVWMRGLVFLPFMVSPVCVSAGVLLLYPGWTASLPLLLAMYAL 387 
orf139.pep LAYPFVAKDVLSAWDALPPDYGRAAAGLGANGFQTACRITFPLLKPALRRGLTLAAATCV 150

1 1 1 1 1 1 1 1 1 1 1 1 1 1 E 1 1 1 1 1 1 1 ! 1 1 1 1 1 1 1 1 j 1 1 ! 1 1 1 

or f 1 3 9ng LAYP FVAKDVLSAWDALPPDYGRAAAGLGANGFQTACRI TFPLLKPALRRGLTLAAATCV 44 7 

orf 139 .pep GEFAATLFLSRPEWQTLTTLIYAYLGRAGEDNYARAMVL 189 

1 1 1 Ml M 1 1 M 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1! 1 1 1 

orf 139ng GEFAATLFLSRPEWQTLTTLIYAYLGRAGEDNYARAMVLTLLLSAFAVCIFLLLDNGEGG 507 

The complete length ORF139ng nucleotide sequence [<SEQ ID 579>] (SEQ ID NO: 579) is
predicted to encode a protein having amino acid sequence [<SEQ ID 580>] (SEQ ID NO: 580):



1 MDGRCWAVRG AFSLLPSAFL AVMWAPLWA VAAYDGLAWR AVLSDAYMLK 

51 RLAWTVFQAA ATCVLVLPLG VPVAWVLARL AFPGRALVLR LLMLPFVMPT 

101 LVAGVGVLAL FGADGLLWRG RQDTPYLLLY GNVFFNLPVL VRAAYQG FAQ 

151 VPAARLQTAR TLGAGAWRPF WDIEMPV LRP WLAGGVCLVF LYCFSGFGLA 

201 LLLGGSRYAT VEVEIYQLVM FELDMAGASA LVWLVLGVTA AAGLLYAWFG 

251 RRAVSDKAVS PVMPSPPQSV GEYVLLAFSV AVLSVCCLFP LSAIWKAWS 

301 AGESRRVLME SETWQAVWNT LRFSAAAVFA AAVLGWYAA AARRLVWMRG 

351 LVFLPFMVSP VCVSAGVLLL YPGWTASLPL LLAMYALLAY PFVAKDVLSA 

401 WDALPPDYGR AAAGLGANGF QTACRITFPL LKPALRRGLT LAAATCVGEF 

451 AATLFLSRPE WQTLTTLIYA YLGRAGEDNY ARAMVLTLLL SAFAVCIFLL 

501 LDNGEGGKRT ETL* 



Further work revealed a variant gonococcal DNA sequence [<SEQ ID 581>] (SEQ ID NO: 581):

1 ATGGATGGAC GGTGTTGGGC GGTACGGGGT GCTTTTTCCC TGCTGCCTTC 

51 GGCTTTTTTG GCGGTAATGG TCGTTGCGCC TTTGTGGGCG GTGGCGGCGT 

101 ATGACGGTTT GGCGTGGCGC GCGGTGCTGT CGGATGCCTA TATGCTCAAA 

151 CGTTTGGCGT GGACGGTGTT TCAGGCGGCG GCAACCTGTG TGCTGGTGCT 

201 GCCTTTGGGC GTGCCTGTCG CGTGGGTGCT GGCGCGGCTG GCGTTCCCGG
251 GGCGGGCTTT GGTGCTGCGC CTGCTGATGC TGCCGTTTGT GATGCCCACG

301 CTGGTGGCGG GCGTGGGCGT GCTGGCTCTG TTCGGGGCGG ACGGGCTGTT
351 GTGGCGCGGC CGGCAGGATA CGCCGTATCT GTTGTTGTAC GGCAATGTGT 
401 TTTTCAACCT GCCCGTGTTG GTCAGGGCGG CGTATCAGGG GTTTGCTCAA 
451 GTGCCTGCGG CACGGCTTCA GACGGCACGG ACGTTGGGCG CGGGGGCGTG 
501 GCGGCGGTTT TGGGACATTG AAATGCCCGT TTTGCGCCCG TGGCTTGCCG 
551 GCGGCGTGTG CCTTGTCTTC CTGTATTGTT TTTCGGGGTT CGGGCTGGCA 
601 TTGCTGTTGG GCGGCAGCCG TTATGCCACG GTCGAAGTGG AAATTTACCA 
651 GTTGGTTATG TTCGAACTCG ATATGGCGGG GGCTTCGGCG CTGGTGTGGC 
701 TGGTGTTGGG GGTAACGGCG GCGGCAGGGT TGCTGTATGC GTGGTTCGGC 
751 AGGCGCGCGG TTTCGGATAA GGCGGTTTCC CCCGTGATGC CGTCGCCGCC 
801 GCAATCGGTG GGGGAATATG TATTGCTGGC ATTTTCGGTG GCGGTGTTGT 
851 CCGTGTGCTG CCTGTTTCCT TTGTCGGCAA TTGTTGTGAA AGCGTGGTCG 
901 GCCGGCGAAT CGCGGCGTGT GTTAATGGAA AGTGAAACGT GGCAGGCAGT 
951 GTGGAATACt ttGCGCTTTT CGGCGGCGGC GGTGTTTGCG GCGGCGGTTT 

1001 TGGGTGTGGT GTATGCGGCG GCGGCGCGGC GGCTGGTGTG GATGCGCGGA 

1051 CTGGTGTTTT TACCGTTTAT GGTGTCGCCG GTTTGTGTTT CGGCGGGCGT 

1101 GCTGCTGCTT TATCCGGGGT GGACGGCTTC GTTACCGCTG CTGCTGGCGA 

1151 TGTATGCGCT GCTGGCGTAT CCGTTTGTGG CAAAAGATGT TTTATCGGCC 

1201 TGGGATGCAC TGCCGCCGGA TTACGGCAGG GCGGCGGCAG GTTTGGGCGC 

1251 AAACGGCTTT CAGACGGCAT GCCGTATCAC GTTCCCCCTC TTGAAACCGG 

1301 CGTTGCGGCG CGGTCTGACT TTGGCGGCGG CGACGTGTGT GGGCGAATTT
1351 GCGGCAACCT TGTTCCTGTC GCGTCCGGAA TGGCAGACGT TGACGACTTT 

1401 GATTTATGCC TATTTGGGGC GTGCGGGTGA GGACAATTAT GCGCGGGCAA
1451 TGGTGTTGAC ATTGCTGTTG TCGGCATTTG CGGTGTGCAT TTTCCTGCTG
1501 TTGGACAACG GCGAAGGCGg aaaACGGACG GAAACGTTAT AA 
This corresponds to the amino acid sequence [<SEQ ID 582; ORF139ng-1>] (SEQ ID NO: 582;
ORF139ng-1):

1 MDGRCWAVRG AFSLLPSAFL AVMWAPLWA VAAYDGLAWR AVLSDAYMLK 

51 RLAWTVFQAA ATCVLVLPLG VPVAWVLA RL AFPGRALVLR LLMLPFVMPT 

101 LVAGVGVLAL FGA DGLLWRG RQDTPYLLLY GNVFFNLPVL VRAAYQGFAQ

151 VPAARLQTAR TLGAGAWRRF WDIEMPVLRP WLAGG VCLVF LYCFSGFGLA 

201 LLLGGSRYAT VEVEIYQLVM FELDMAG ASA LVWLVLGVTA AAGLL YAWFG 

251 RRAVSDKAVS PVMPSPPQSV GEYVLLAFS V AVLSVCCLFP LSAIW KAWS 

301 AGESRRVLME SETWQAVWNT LRFS AAAVFA AAVLGVVYAA AA RRLVWMRG

351 LVFLPFMVSP VCVSAGVLLL YPGWTASLPL LLAMYALLAY PFVAKDVLSA

401 WDALPPDYGR AAAGLGANGF QTACRITFPL LKPALRRGLT LAAATCVGEF 

451 AATLFLSRPE WQTLTTLIYA YLGRAGEDNY ARAM VLTLLL SAFAVCIFLL 

501 LDNGEGGKRT ETL* 

ORF139ng-1 (SEQ ID NO: 582) and ORF139-1 (SEQ ID NO: 576) show 95.9% identity over a
513aa overlap:

or f 1 3 9ng MDGRCWAWGAFSLLPSAFLAVMWAPLWAVAAYDGIAWRAVLSDAYMLKRLAWTVFQAA 

1 1 1 1 hi Ihllllll IIIIIIIIM IIIIIIMMillllllll llllllll 

or f 1 3 9 - 1 MDGRRWWWGAFALLPS AFLAVMWAPLWAVAAYDGLAWRAVLSDAYMLKRLAWTVFQAA 

orf139ng ATCVLVLPLGVPVAWVLARLAFPGRALVLRLLMLPFVMPTLVAGVGVLALFGADGLLWRG

1 1 1 1 1 1 1 1 II 1 1 1 1 1 1 1 1 1 1 II 1 1 1 II 1 1 1 1 II II 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 

orf 13 9-1 ATCVLVLPLGVPVAWVLARLAFPGRALVLRLLMLPFVMPTLVAGVGVLALFGADGLLWRG 

orf 13 9ng RQDTPYLLLYGNVFFNLPVLVRAAYQGFAQVPAARLQTARTLGAGAWRRFWDIEMPVLRP 

I I I I I I I I I I I I I I I I I I I I I I I I I I I I = I I I I I I I I I I I I I I I I I I I I I I I I I I I M I I 

orf139-1 RQDTPYLLLYGNVFFNLPVLVRAAYQGFVQVPAARLQTARTLGAGAWRRFWDIEMPVLRP

orf 13 9ng WLAGGVCLVFLYCFSGFGLALLLGGSRYATVEVE I YQLVMFELDMAGASALVWLVLGVTA 

- 1 1 ! I M i 1 1 1 1 M 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 M 1 1 1 1 M M I M 1 1 IhllMIMIII 

orf 139-1 WLAGGVCLVFLYCFSGFGLALLLGGSRYATVEVEIYQLVMFELDMAVASVLVWLVLGVTA 
or f 13 9ng AAGLLYAWFGRRAVSDKAVSPVMPSPPQSVGEYVLLAFSVAVLSVCCLFPLSAI WKAWS 

30 || 1 1 1| I II I II II I II 1 1 1 1 1 1 1 1 1 1 i II 1 1 1 M II h : 1 1 II M I II 1 1 II II I II I 

orf 13 9-1 AAGLLYAWFGRRAVSDKAVSPVMPSPPQSVGEYVLLAFAAAVLSVCCLFPLLAIWKAWS 
orf 1 3 9ng AGESRRVLMESETWQAVWNTLRFSAAAVFAAAVLGVVYAAAARRLVWMRGLVFLPFMVSP 

MM IMIIIIIMIIIIIIIIIIMhIIIIIIIIMIMM Mlllhllllllll 

orf 13 9 AGESWRVLMESETWQAVWNTLRFSAAAVYAAAVLGWYAAAARRSAWMRGLMFLPFMVSP 
orf139ng VCVSAGVLLLYPGWTASLPLLLAMYALLAYPFVAKDVLSAWDALPPDYGRAAAGLGANGF

Illlllllllll 1 1 1 II I M 1 1 1 1 1 1 II 1 1 1 1 II II 1 1 1 1 1 II I II I II M 1 1 1 1 1 II I 

orf 13 9-1 VCVSAGVLLL YPQWTASLPLLLAMYALLAYPFVAKDVLSAWDALPPDYGRAAAGLGANGF 

orf 13 9ng QTACRITFPLLKPALRRGLTLAAATCVGEFAATLFLSRPEWQTLTTLIYAYLGRAGEDNY 

I I I I II II II I I I II M II I II I I I I I I II I I I I II I II II I M I I I I II II I I I II M I 
orf139-1 QTACRITFPLLKPALRRGLTLAAATCVGEFAATLFLSRPEWQTLTTLIYAYLGRAGEDNY

orf 13 9ng ARAMVLTLLLS AFAVC I FLLLDNGEGGKRTETL 

I I I I I I I M h I I h I I i I I I = t I M I = 1 I I I 
or f 1 3 9 - 1 ARAMVLTLLLAAFALGI FLLLDGGEGGKQTETL 
Based on the presence of a predicted binding-protein-dependent transport systems inner membrane
component signature (underlined) in the gonococcal protein, it is predicted that the proteins from 
N. meningitidis and N. gonorrhoeae, and their epitopes, could be useful antigens for vaccines or 
diagnostics, or for raising antibodies. 

Example 70 

The following partial DNA sequence was identified in N. meningitidis [<SEQ ID 583>] (SEQ ID
NO: 583):

1 ATGGACGGCT GGACACAGAC GCTGTCCGCG CAAACCCTGT TGGGCATTTC 

51 GGCGGCGGCA ATCATCCTCA TTCTGATTTT AATCGTCAGA TTCCGCATCC 

101 ACGCGCTGCT GACACTGGTC ATCGTCAGCC TGCTGACGGC TTTGGCAACC 

151 GGTTTGCCCA CAGGCAGCAT TGTCAAAGAC ATACTGGTCA AAAACTTCGG 

201 CGGCACGCTC GGCGGCGTGG CGCTTCTGGT CGGCCTGGGC GCGATGCTCG

251 AACGTTTGGT C. . . 

This corresponds to the amino acid sequence [<SEQ ID 584; ORF140>] (SEQ ID NO: 584;
ORF140):

1 MDGWTQTLSA QTLLGISAAA IILILILIVR FRIHALLTLV IVSLLTALAT 

51 GLPTGSIVKD ILVKNFGGTL GGVALLVGLG AMLERLV. . 
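
The correspondence between a nucleotide listing and the amino acid sequence given for it can be checked by conceptual translation. The following is a minimal sketch under stated assumptions: it uses the standard genetic code in reading frame 1, the names BASES, AMINO_ACIDS, CODON_TABLE and translate are hypothetical, and any codon containing an ambiguous base (such as N) is reported as X. It is not a description of the software used to derive the sequences in this document.

```python
from itertools import product

BASES = "TCAG"
# Standard genetic code, amino acids listed in TCAG codon order (TTT, TTC, TTA, ...).
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(codon): aa
               for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)}


def translate(dna: str) -> str:
    """Translate DNA in frame 1; whitespace is ignored, unknown codons give X."""
    seq = "".join(dna.split()).upper()
    usable = len(seq) - len(seq) % 3  # ignore a trailing partial codon
    return "".join(CODON_TABLE.get(seq[i:i + 3], "X")
                   for i in range(0, usable, 3))


# First 30 nucleotides of the partial ORF140 sequence (SEQ ID NO: 583).
print(translate("ATGGACGGCT GGACACAGAC GCTGTCCGCG"))
```

The 30 nucleotides shown translate to MDGWTQTLSA, the first ten residues listed for ORF140. Stop codons are returned as an asterisk, matching the notation used in the complete protein listings.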

Further work revealed the complete nucleotide sequence [<SEQ ID 585>] (SEQ ID NO: 585):

1 ATGGACGGCT GGACACAGAC GCTGTCCGCG CAAACCCTGT TGGGCATTTC 

51 GGCGGCGGCA ATCATCCTCA TTCTGATTTT AATCGTCAAA TTCCGCATCC 

101 ACGCGCTGCT GACACTGGTC ATCGTCAGCC TGCTGACGGC TTTGGCAACC 

151 GGTTTGCCCA CAGGCAGCAT TGTCAACGAC ATACTGGTCA AAAACTTCGG 

201 CGGCACGCTC GGCGGCGTGG CGCTTCTGGT CGGCCTGGGC GCGATGCTCG

251 GACGTTTGGT CGAAACATCC GGCGGCGCAC AGTCGCTGGC GGACGCGCTG

301 ATCCGGATGT TCGGCGAAAA ACGCGCACCG TTCGCGCTGG GCGTTGCCTC
351 GCTGATTTTC GGCTTCCCGA TTTTCTTCGA TGCCGGACTA ATCGTCATGC

401 TGCCCATCGT GTTCGCCACC GCACGGCGCA TGAAACAGGA CGTACTGCCC
451 TTCGCGCTTG CCTCCATCGG CGCATTTTCC GTCATGCACG TCTTCCTGCC
501 GCCCCATCCG GGCCCGATTG CCGCTTCCGA ATTTTACGGC GCGAACATCG 
551 GCCAAGTTTT GATTTTGGGT CTGCCGACCG CCTTCATCAC ATGGTATTTC 
601 AGCGGCTATA TGCTCGGCAA AGTGTTGGGG CGCACCATCC ATGTTCCCGT 
651 TCCCGAACTG CTCAGCGGCG GCACGCAAGA CAACGACCTG CCGAAAGAAC 
701 CTGCCAAAGC AGGAACGGTC GTCGCCATCA TGCTGATTCC CATGCTGCTG 
751 ATTTTCCTGA ATACCGGCGT ATCGGCCCTC ATCAGCGAAA AACTCGTAAG 
801 TGCGGACGAA ACCTGGGTTC AGACGGCAAA AATAATCGGT TCGACACCGA 
851 TCGCCCTTCT GATTTCCGTA TTGGTCGCAC TGTTTGTCTT GGGACGCAAA 
901 CGCGGCGAAA GCGGCAGCGC GTTGGAAAAA ACCGTGGACG GCGCACTCGC 
951 CCCCGTCTGT TCCGTGATTC TGATTACCGG CGCGGGCGGT ATGTTCGGCG 

1001 GCGTTTTGCG CGCTTCCGGC ATCGGCAAGG CACTCGCCGA CAGCATGGCG 

1051 GATTTGGGCA TTCCCGTCCT TTTGGGCTGT TTCCTTGTCG CCTTGGCACT 

1101 GCGTATCGCG CAAGGTTCGG CAACCGTCGC CCTGACCACC GCCGCCGCGC 

1151 TGATGGCTCC TGCCGTTGCC GCCGCCGGCT TTACCGACTG GCAGCTCGCC 

1201 TGTATCGTAT TGGCAACGGC GGCAGGTTCG GTCGGTTGCA GCCACTTCAA 
1251 CGACTCCGGC TTCTGGCTGG TCGGCCGTCT CTTGGACATG GACGTACCGA
1301 CCACGCTGAA AACCTGGACG GTCAACCAAA CCCTCATCGC ACTCATCGGC
1351 TTTGCCTTGT CCGCACTGCT GTTCGCCATC GTCTGA 
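
A numbered listing of this kind can be turned back into a contiguous sequence before any length check or translation. The sketch below is illustrative only: join_listing is a hypothetical helper, and it assumes that every leading all-digit token on a line is part of the position label (which also tolerates labels broken into two tokens, such as "13 01" for 1301) and that the remaining tokens are sequence blocks.

```python
def join_listing(lines: list[str]) -> str:
    """Concatenate sequence blocks from a numbered listing, dropping positions."""
    blocks = []
    for line in lines:
        tokens = line.split()
        # Leading all-digit tokens are position labels, possibly split in two.
        while tokens and tokens[0].isdigit():
            tokens.pop(0)
        blocks.extend(tokens)
    return "".join(blocks).upper()


# Last two rows of the complete ORF140-1 nucleotide listing (SEQ ID NO: 585).
tail = join_listing([
    "13 01 CCACGCTGAA AACCTGGACG GTCAACCAAA CCCTCATCGC ACTCATCGGC",
    "1351 TTTGCCTTGT CCGCACTGCT GTTCGCCATC GTCTGA",
])
print(len(tail), tail[-3:])
```

The final row starts at position 1351 and holds 36 nucleotides, so the listing ends at position 1386. That is 462 codons, which agrees with a 461-residue protein followed by the terminal TGA stop codon.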

This corresponds to the amino acid sequence [<SEQ ID 586; ORF140-1>] (SEQ ID NO: 586;
ORF140-1):



1 MDGWTQTLSA QTLLGISAAA IILILILIVK FRIHALLTLV IVSLLTALAT 

51 GLPTGSIVND ILVKNFGGTL GGVALLVGLG AMLGRLV ETS GGAQSLADAL 

101 IRMFGEKRAP FALGVAS LIF GFPIFFDAGL IVML PIVFAT ARRMKQD VLP 

151 FALASIGAFS VMHV FLPPHP GPIAASEFYG ANIGQVLILG LPTAFITWYF 

201 SGYMLGKVLG RTIHVPVPEL LSGGTQDNDL PKEPA KAGTV VAIMLIPMLL 

251 I FL NTGVS AL ISEKLVSADE TWVQTAKIIG S TPIALLISV LVALFVLG RK

301 RGESGSALEK TVDGALAPVC SVILITGAGG MFGGVL RASG IGKALADSMA

351 DLGIPVLLGC FLVALALRIA QGSATVALTT AAALMAPAVA AAGFTDWQLA

401 CIVLATAAGS VGCSHFNDSG FWLVGRLLDM DVPTTLKTWT VNQT LIALIG
451 FALSALLFAI V *

Computer analysis of this amino acid sequence gave the following results: 



Homology with a predicted ORF from N. meningitidis (strain A)



ORF140 (SEQ ID NO: 584) shows 95.4% identity over a 87aa overlap with an ORF (ORF140a)
(SEQ ID NO: 588) from strain A of N. meningitidis:



10 20 30 40 50 60 

orf 14 0 .pep MDGWTQTLSAQTLLGISAAAIILILILIVRFRIHALLTLVIVSLLTALATGLPTGSIVKD 

II III II Mill III II II MM MM II II III Mill II II MM Mi MM 

orf 14 0a MDGWTQTLSAQTLLGI SAAAI I LI L I LI VKFRIHALLTLVI VSLLTALATGLPTGS I VND 

10 20 30 40 50 60 

70 80 
or f 14 0 . pep I LVKNFGGTL GGVALLVGLGAMLERLV 

: II I I II I I I I I I I I I II I I I I I III 
orf 14 0a VLVKNFGGTL GGVALLVGLGAMLGRLV ETSGGAQSLADALIRMFGEKRAPFALGVASLIF 

70 . 80 90 100 110 120 



The complete length ORF140a nucleotide sequence [<SEQ ID 587>] (SEQ ID NO: 587) is:



1 ATGGACGGCT GGACACAGAC GCTGTCCGCG CAAACCCTGT TGGGCATTTC 

51 GGCGGCGGCA ATCATCCTCA TTCTGATTTT AATCGTCAAA TTCCGCATCC 

101 ACGCGCTGCT GACACTGGTC ATCGTCAGCC TGCTGACGGC TTTGGCAACC 

151 GGTTTGCCCA CAGGCAGCAT TGTCAACGAC GTACTGGTCA AAAACTTCGG 

201 CGGCACGCTC GGCGGCGTGG CGCTTCTGGT CGGCCTGGGC GCGATGCTCG

251 GACGTTTGGT CGAAACATCC GGCGGCGCAC AGTCGCTGGC GGACGCGCTG

301 ATCCGGATGT TCGGCGAAAA ACGCGCACCG TTCGCGCTGG GCGTTGCCTC 

351 GCTGATTTTC GGCTTCCCGA TTTTCTTCGA TGCCGGACTA ATCGTCATGC 

401 TGCCCATCGT GTTCGCCACC GCACGGCGCA TGAAACAGGA CGTACTGCCC

451 TTCGCGCTTG CCTCCATCGG CGCATTTTCC GTCATGCACG TCTTCCTGCC

501 GCCCCATCCG GGCCCGATTG CCGCTTCCGA ATTTTACGGC GCGAACATCG 

551 GCCAAGTTTT GATTTTGGGT CTGCCGACCG CCTTCATCAC ATGGTATTTC 

601 AGCGGCTATA TGCTCGGCAA AGTGTTGGGG CGCACCATCC ATGTTCCCGT 

651 TCCCGAACTG CTCAGCGGCG GCACGCAAGA CAACGACCTG CCGAAAGAAC 
701 CTGCCAAAGC AGGAACGGTC GTCGCCATCA TGCTGATTCC CATGCTGCTG

751 ATTTTCCTGA ATACCGGCGT ATCGGCCCTC ATCAGCGAAA AACTCGTAAG 

801 TGCGGACGAA ACCTGGGTTC AGACGGCAAA AATAATCGGT TCGACACCGA 

851 TCGCCCTTCT GATTTCCGTA TTGGTCGCAC TGTTTGTCTT GGGACGCAAA 

901 CGCGGCGAAA GCGGCAGCGC GTTGGAAAAA ACCGTGGACG GCGCACTCGC

951 CCCCGTCTGT TCCGTGATTC TGATTACCGG CGCGGGCGGT ATGTTCGGCG 

1001 GCGTTTTGCG CGCTTCCGGC ATCGGCAAGG CACTCGCCGA CAGCATGGCG 

1051 GATTTGGGCA TTCCCGTCCT TTTGGGCTGT TTCCTTGTCG CCTTGGCACT 

1101 GCGTATCGCG CAAGGTTCGG CAACCGTCGC CCTGACCACC GCCGCCGCGC 

1151 TGATGGCTCC TGCCGTTGCC GCCGCCGGCT TTACCGACTG GCAGCTCGCC

1201 TGTATCGTAT TGGCAACGGC GGCAGGTTCG GTCGGTTGCA GCCACTTCAA 

1251 CGACTCCGGC TTCTGGCTGG TCGGCCGCCT CTTGGACATG GACGTACCGA 

1301 CCACGCTGAA AACCTGGACG GTCAACCAAA CCCTCATCGC ACTCATCGGC 

1351 TTTGCCTTGT CCGCACTGCT GTTCGCCATC GTCTGA 






This encodes a protein having amino acid sequence [<SEQ ID 588>] (SEQ ID NO: 588):



1 MDGWTQTLSA QTLLGISAAA IILILILIVK FRIHALLTLV IVSLLTALAT 

51 GLPTGSIVND VLVKNFGGTL GGVALLVGLG AMLGRLV ETS GGAQSLADAL 

101 IRMFGEKRAP FALGVAS LIF GFPIFFDAGL IVML PIVFAT ARRMKQDVLP 

151 FALASIGAFS VMHV FLPPHP GPIAASEFYG ANIGQVLILG LPTAFITWYF

201 SGYMLGKVLG RTIHVPVPEL LSGGTQDNDL PKEPA KAGTV VAIMLIPMLL 

251 IFLN TGVSAL ISEKLVSADE TWVQTAKIIG S TPIALLISV LVALFVLG RK 

301 RGESGSALEK TVDGALAPVC SVILITGAGG MFGGVL RASG IGKALADSMA 

351 DLGIPVLLGC FLVALALRIA QGSATVALTT AAALMAPAVA AAGFTDWQLA 

401 CIVLATAAGS VGCSHFNDSG FWLVGRLLDM DVPTTLKTWT VNQT LIALIG

451 FALSALLFAI V * 

ORF140a (SEQ ID NO: 588) and ORF140-1 (SEQ ID NO: 586) show 99.8% identity over a 461 aa
overlap:



orf140-1.pep MDGWTQTLSAQTLLGISAAAIILILILIVKFRIHALLTLVIVSLLTALATGLPTGSIVND 60

I I M I I I I I I M I I I I I I II I I I I I I I I I I I I I I I I I I II I I I I I I I I I I I I I I ! I I 
orf 14 0a MDGWTQTLSAQTLLGISAAAI ILILILIVKFRIHALLTLVIVSLLTALATGLPTGSIVND 60 

orf 140-1 .pep I LVKNFGGTLGGVALLVGLGAMLGRL VETS GGAQSLADAL I RMFGEKRAP FALGVASL I F 120 
: I M I : I I II I I I I I I I I I I I I I I I I I I I I ! I I I I I I I I ! I I I I I I I I I I M I I I I I I 
orf140a VLVKNFGGTLGGVALLVGLGAMLGRLVETSGGAQSLADALIRMFGEKRAPFALGVASLIF 120

orf 140-1 .pep GFP I FFDAGL I VMLP I VFATARRMKQDVLPFALAS IGAFS VMHVFLPPHPGP I AASEFYG 180 

1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 ! 1 1 1 1 1 1 1 1 1 1 1 M I M 1 1 1 1 1 1 1 1 1 , 1 1 

orf140a GFPIFFDAGLIVMLPIVFATARRMKQDVLPFALASIGAFSVMHVFLPPHPGPIAASEFYG 180

orf 140-1 .pep ANIGQVLILGLPTAFITWYFSGYMLGKVLGRTIHVPVPELLSGGTQDNDLPKEPAKAGTV 240 

40 || MM MINI Ml || || || | M MINI 1 1 1 1 II I II 1 1 1 1 1 1 1 II MINIMI I 

orf 14 0a ANIGQVLI LGLPTAF I TWYFSGYMLGKVLGRT I HVP VP ELLS GGTQDNDL PKEPAKAGTV 240 

orf 140-1 .pep VAIMLIPMLL IFLNTGVSAL I SEKLVSADETWVQTAKI I GSTP I ALL I SVLVALFVLGRK 300 

I II 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 ! 1 1 1 1 1 1 1 1 1 1 1 M I 

orf 14 0a VAIMLIPMLL IFLNTGVSAL I SEKLVSADETWVQTAKI I GSTP I ALL I SVLVALFVLGRK 3 00 

orf140-1.pep RGESGSALEKTVDGALAPVCSVILITGAGGMFGGVLRASGIGKALADSMADLGIPVLLGC 360

1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 E 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 i 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 

orf 14 0a RGESGS ALEKTVDGALAPVCS VI L I TGAGGMFGGVLRASG I GKALADSMADLG I PVLLGC 360 



orf140-1.pep FLVALALRIAQGSATVALTTAAALMAPAVAAAGFTDWQLACIVLATAAGSVGCSHFNDSG 420

II I M 1 1 1 1 1 M 1 1 1 1 1 1 1 1 1 M 1 1 1 1 1 1 1 1 1 II 1 1 II M 1 1 1 M 1 1 M M M I M I 
orf140a FLVALALRIAQGSATVALTTAAALMAPAVAAAGFTDWQLACIVLATAAGSVGCSHFNDSG 420

orf140-1.pep FWLVGRLLDMDVPTTLKTWTVNQTLIALIGFALSALLFAIV 461

M I I I M I I I I I I I Ml I I II M I I I II ll I I I I I I I II I I 
orf 14 0a FWLVGRLLDMDVPTTLKTWTVNQTL I AL I GFALS ALL FA IV 461 

Homology with a predicted ORF from N. gonorrhoeae 

ORF140 (SEQ ID NO: 584) shows 92% identity over a 87aa overlap with a predicted ORF
(ORF140ng) (SEQ ID NO: 590) from N. gonorrhoeae:

orf 140 . pep MDGWTQTLSAQTLLGISAAAIILILILIVRFRIHALLTLVIVSLLTALATGLPTGSIVKD 60 

Ml III MIMMMMIIMMIMM IMIIIIIIMMMII IIMIIIMM 
orf 14 0ng MDGRTQTLSAQTLLGISAAAIILILILIVKFRIRALLTLVIASLLTALATGLPTGSIVND 60 

or f 14 0 . pep I LVKNFGGTLGGVALLVGLGAMLERLV 8 7 

M MM MIMMMIMM! Ml 
orf 14 Ong VLVKNFGGTLGGVALLVGLGAMLGRLVETSGGAQSLADALIRMFGEKRAPFAPGVASLI F 120 

The complete length ORF140ng nucleotide sequence [<SEQ ID 589>] (SEQ ID NO: 589) was
predicted to encode a protein having amino acid sequence [<SEQ ID 590>] (SEQ ID NO: 590):

1 MDGRTQTLSA QTLLGISAAA IILILILIVK FRIRALLTLV IASLLTALAT 

51 gLPTGSIVND VLVKNFGGTL GGVALLVGLG AMLGRLV ETS GGAQSLADAL 

101 IRMFGEKRAP FAPGVAS LIF GFPIFFDAGL IVML PIVFAT ARRMKQDVLP 

151 FALASVGAFS VMHV FLPPHP GPIAASEFYG ANIGQVLILG LPTAFITWYF 

201 SGYMLGKVLG RAIHVPVPEL LSGGTQDSDP PKEPA KAGTV VAVMLIPMLL 

251 I FLN TGVS AL ISEKLVSADE TWVQTAKMIG S TPVALLISV LAALLVLG RK 

301 RGESGSTLEK TVDGALAPA C SVILITGAGG MFGGVL RASG IGKALADSMA 

351 DLGIPVLLGC FLVALALRI A QGSATVALTT AAALMAPAVA AAGFTDWQLA

401 CIVLATAAGS VGCSHFNDSG FWLVGRLSDM DVPTTLKTWT VNQT LIAFIG
451 FALSALLFAI V* 

Further work revealed a variant gonococcal DNA sequence [<SEQ ID 591>] (SEQ ID NO: 591):



1 ATGGACGGCC GGACACAGAC GCTGTCCGCG CAAACCTTGT TGGGCATTTC 

51 GGCGGCGGCA ATCATCCTCA TTCTGATTTT AATCGTCAAA TTCCGCATCC 

101 GCGCGCTGCT GACACTGGTC ATCGCCAGCC TGCTGACGGC TTTGGCAACC 

151 GGTTTGCCCA CAGGCAGCAT CGTCAACGAC GTACTGGTCA AAAACTTCGG 

201 CGGCACGCTC GGCGGCGTGG CGCTTCTGGT CGGTCTGGGC GCAATGCTCG 

251 GACGTTTGGT AGAAACATCC GGCGGCGCAC AGTCGCTGGC GGACGCGCTG

301 ATCCGGATGT TCGGCGAAAA ACGCGCACCG TTCGCTCCGG GCGTTGCCTC
351 GCTGATTTTC GGCTTCCCGA TTTTCTTCGA TGCCGGACTA ATCGTCATGC

401 TGCCCATCGT ATTCGCCACC GCACGGCGCA TGAAACAGGA CGTACTGCCC
451 TTCGCGCTTG CCTCCGTCGG CGCATTTTCC GTCATGCACG TCTTCCTGCC
501 GCCCCATCCG GGCCCGATTG CCGCTTCCGA ATTTTACGGC GCGAACATCG 
551 GCCAGGTTTT GATTTTGGGT CTGCCGACCG CCTTCATCAC ATGGTATTTC 
601 AGCGGCTATA TGCTCGGCAA AGTGTTGGGG CGCGCCATCC ATGTTCCCGT 
651 TCCCGAACTG CTCAGCGGCG GCACGCAAGA CAGCGACCCG CCGAAAGAAC 
701 CTGCCAAAGC AGGAACGGTC GTCGCCGTCA TGCTGATTCC CATGCTGCTG 
751 ATTTTCCTGA ATACCGGCGT ATCAGCCCTC ATCAGCGAAA AACTCGTAAG 
801 TGCGGACGAA ACTTGGGTTC AGACGGCAAA AATGATCGGT TCGACACCTG 
851 TCGCCCTTCT GATTTCCGTA TTGGCCGCAC TGTTGGTCTT GGGACGCAAA 
901 CGCGGCGAAA GCGGCAGCAC GTTGGAAAAA ACCGTGGACG GCGCACTCGC 
951 CCCCGCCTGT TCCGTGATTC TGATTACCGG CGCGGGCGGT ATGTTCGGCG

1001 GCGTTTTGCG CGCTTCCGGC ATCGGCAAGG CACTCGCCGA CAGCATGGCG 

1051 GATTTGGGCA TTCCCGTCCT TTTGGGCTGC TTCCTTGTCG CCTTGGCACT 

1101 GCGTATCGCG CAAGGTTCGG CAACCGTCGC CCTGACCACA GCCGCCGCGC 

1151 TGATGGCTCC TGCCGTTGCC GCCGCCGGCT TTACCGACTG GCAGCTCGCC 

1201 TGTATCGTAT TGGCAACGGC GGCAGGTTCG GTCGGTTGCA GCCACTTCAA 

1251 CGACTCCGGC TTCTGGCTGG TCGGCCGCCT CTTGGATATG GACGTACCGA 

1301 CCACGCTGAA AACCTGGACG GTCAACCAAA CCCTCATCGC ATTCATCGGC 

1351 TTTGCCTTGT CCGCACTGCT GTTTGCCATC GTCTGA



This corresponds to the amino acid sequence [<SEQ ID 592; ORF140ng-1>] (SEQ ID NO: 592;
ORF140ng-1):



1 MDGRTQTLSA QTLLGISAAA IILILILIVK FRIRALLTLV IASLLTALAT 

51 GLPTGSIVND VLVKNFGGTL GGVALLVGLG AMLGRLV ETS GGAQSLADAL 

101 IRMFGEKRAP FAPGVAS LIF GFPIFFDAGL IVML PIVFAT ARRMKQD VLP 

151 FALASVGAFS VMHV FLPPHP GPIAASEFYG ANIGQVLILG LPTAFITWYF 

201 SGYMLGKVLG RAIHVPVPEL LSGGTQDSDP PKEPA KAGTV VAVMLIPMLL 

251 I FLNTGVS AL ISEKLVSADE TWVQTAKMIG S TPVALLISV LAALLVLG RK 

301 RGESGSTLEK TVDGALAPAC SVILITGAGG MFGGVL RASG IGKALADSMA 

351 DLGIPVLLGC FLVALALRIA QGSATVALTT AAALMAPAVA AAGFTDWQLA

401 CIVLATAAGS VGCSHFNDSG FWLVGRLLDM DVPTTLKTWT VNQT LIAFIG
451 FALSALLFAI V * 

ORF140ng-1 (SEQ ID NO: 592) and ORF140-1 (SEQ ID NO: 586) show 96.3% identity over a
461 aa overlap:



orf 14 0ng-l :pep MDGRTQTLSAQTLLGISAAAIILILILIVKFRIRALLTLVIASLLTALATGLPTGSIVND 

III I M i I II I M 1 1 1 1 1 , II 1 1 M 1 1 1 1 h I M 1 1 1 h I M II 1 1 1 1 1 1 1 1 1 1 M 

orf 140-1 MDGWTQTLS AQTLLG I SAAAI I L I L I L I VKFRI HALLTLV I VSLLTALATGLPTGS I VND 

orf 140ng- 1 . pep VLVKNFGGTLGGVALLVGLGAMLGRLVETSGGAQSLADALIRMFGEKRAPFAPGVASLIF 

:|| II II llllll IIMII lllllllllllll IIIIIIMIIIIIIM lllllll 

orf 140 - 1 ILVKNFGGTLGGVALLVGLGAMLGRLVETSGGAQSLADALIRMFGEKRAPFALGVASLIF 
orf 14 0ng-l .pep GFP I FFDAGLI VMLP I VFATARRMKQDVLPFALASVGAFSVMHVFLPPHPGP I AASEFYG 

1 1 1 1 1 i 1 1 M 1 1 1 h 1 1 1 1 1 M 1 1 1 1 II 1 1 1 1 1 h 1 1 1 1 M I II 1 1 1 1 M II 1 1 M 1 1 1 

orf 140-1 GFP I FFDAGLI VMLP I VFATARRMKQDVLPFALAS I GAFSVMHVFLPPHPGP I AASEFYG 

orf 14 0ng-l .pep AN I GQVL I LGLPTAF I TWYFSGYMLGKVLGRA I HVP VPELLSGGTQDSDPPKEP AKAGTV 

I I I I I I I I I I I I I I I I I I I I I I II I I I I I : I I M I I I I I I I I I h I llllllllll 
orf 140 - 1 ANIGQVL I LGLPTAF I TWYFSGYMLGKVLGRT I HVP VPELLSGGTQDNDL PKEPAKAGTV 

orf 140ng-l .pep VAVML I PMLL I FLNTGVS AL I S EKLVS ADETWVQTAKM IGSTPVALL I S VLAALLVLGRK 

I : I I I h I I I I I I ' I I I I I I I I I I I I I I I I I I I I h I I I I hi I I I I I h I hi I M 
orf 140-1 VAIMLIPMLLIFLNTGVSALISEKLVSADETWVQTAKIIGSTPIALLISVLVALFVLGRK 

orf 14 0ng-l .pep RGESGSTLEKTVDGALAPACSVILITGAGGMFGGVLRASGIGKALADSMADLGIPVLLGC 
I I I I I h I I I I I I I I I I :| II ! I I M I , I I I I II M I I I I I I I I I I I I I I I I : I I I I I 
orf 140 - 1 RGESGS ALEKTVDGALAP VCSV I L I TGAGGMFGGVLRASG I GKALADSMADLG I PVLLGC 

orf 140ng- 1 . pep FLVALALR I AQGSATVALTTAAALMAPAVAAAGFTDWQLAC I VLATAAGS VGCSHFNDSG 

I I M I . I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I ' I I I I M I I I I I 
or f 14 0 - 1 FLVALALR I AQGSATVALTTAAALMAPAVAAAGFTDWQ LAC I VLATAAGS VGCSHFNDSG 
orf140ng-1.pep FWLVGRLLDMDVPTTLKTWTVNQTLIAFIGFALSALLFAIV

I I I I I I I I I I I M I I I I I I I I M Ih I I I I I M I I I I I I 
orf 14 0-1 FWLVGRLLDMDVPTTLKTWTVNQTL IALIGFALSALLFAIV 

Furthermore, ORF140ng-1 (SEQ ID NO: 592) is homologous to an E. coli protein (SEQ ID NO:
1148):

gi|882633 (U29579) ORF_o454 [Escherichia coli] >gi|1789097 (AE000358) o454;
This 454 aa ORF is 34% identical (9 gaps) to 444 residues of an approx. 456 aa
protein GNTP_BACLI SW: P46832 [Escherichia coli] Length = 454
Score = 210 bits (529), Expect = 1e-53

Identities = 130/384 (33%), Positives = 194/384 (49%), Gaps = 19/384 (4%) 
Query: 148 VLPFALASVGAFSVMHVFLPPHPGPIAASEFYGANIGQVLILGLPTAFITWYFSGYMLGK 207
           L F L G +HV +PPHPGP+AA+ A+IG + I+G+ +I GY K
Sbjct: 140 PLKFGLPVAGIMLTVHVAVPPHPGPVAAAGLLHADIGWLTIIGIAIS-IPVGWGYFAAK 198

Query: 208 VLGRAIHVPVPELL SGGTQDSDPPKEPAKAGTVVAVMLIPMLLIFLNTGV 257
           + + + + E+L G T+ SD P A V ++++IP+ +I T
Sbjct: 199 IINKRQYAMSVEVLEQMQLAPASEEGATKLSDKINPPGVA-LVTSLIVIPIAIIMAGT-- 255

Query: 258 SALISEKLVSADETWVQTAKMIGSTPXXXXXXXXXXXXXXGRKRGESGSTLEKTVDGALA 317
           +S L+ + T ++IGS +RG S + AL
Sbjct: 256 VSATLMPPSHPLLGTLQLIGSPMVALMIALVLAFWLLALRRGWSLQHTSDIMGSALP 312

Query: 318 PACSVILITGAGGMFGGVLRASGIGKALADSMADLGIPVLLGCFLVALALRIAQGSXXXX 377
           A VIL+TGAGG+FG VL SG+GKALA+ + + +P+L F+++LALR +QGS
Sbjct: 313 TAAVVILVTGAGGVFGKVLVESGVGKALANMLQMIDLPLLPAAFIISLALRASQGS--AT 370

Query: 378 XXXXXXXXXXXXXXXGFTDWQLACIVLATAAGSVGCSHFNDSGFWLVGRLLDMDVPTTLK 437
           G Q + LA G +G SH NDSGFW+V + L + V LK
Sbjct: 371 VAILTTGGLLSEAVMGLNPIQCVLVTLAACFGGLGASHINDSGFWIVTKYLGLSVADGLK 430

Query: 438 TWTVNQTLIAFIGFALSALLFAIV 461
           TWTV T++ F GF + + ++A++
Sbjct: 431 TWTVLTTILGFTGFLITWCVWAVI 454





Based on this analysis, including the presence of a putative leader sequence
(double-underlined) and several putative transmembrane domains (single-underlined) in the
gonococcal protein, it is predicted that the proteins from N. meningitidis and N. gonorrhoeae, and
their epitopes, could be useful antigens for vaccines or diagnostics, or for raising antibodies.

Example 71 



The following partial DNA sequence was identified in N. meningitidis [<SEQ ID 593>] (SEQ ID
NO: 593):
1 . . GATTTCGGCA TATCGCCCGT GTATCTTTGG GTTGCCGCCG CGTTGAAACA
51 TTTGCTGTCG CCGTGGGCTG CCGACTCATA CGATGTCGCA CGCTTTGCAG
101 GCGTATTTTT TGCCGTTATC GGACTGACTT CCTGCGGCTT TGCCGGTTTC
151 AACTTTTTGG GCAGACACCA CGGGCGCAC . GTCGTCCTGA TTCTCATCGG
201 CTGTATCGGG CTGATTCCAG TTGCCCATTT CCTCAACCCC GCTGCCGCCG
251 CCTTTGCCGC CGCCGGACTG GTGCTGCACG GTTATTCTTT GGCTCGCCGG
301 CGCGTGATTG CCGCCTCTTT TCTGCTCGGT ACGGGCTGGA CGCTGATGTC
351 GTTGGCAGCA GCTTATCCGG CAGCATTTGC CCTGATGCTG CCCTTGCCCG
401 TACTGATGTT TTTCCGTCCG


This corresponds to the amino acid sequence [<SEQ ID 594; ORF141>] (SEQ ID NO: 594;
ORF141):



1 . .DFGISPVYLW VAAAFKHLLS PWAADSYDVA RFAGVFFAVI GLTSCGFAGF 
51 NFLGRHHGRX VVLILIGCIG LIPVAHFLNP AAAAFAAAGL VLHGYSLARR
101 RVIAASFLLG TGWTLMSLAA AYPAAFALML PLPVLMFFRP . . 

Further work revealed the complete nucleotide sequence [<SEQ ID 595>] (SEQ ID NO: 595):



1 ATGCTGACCT ATACCCCGCC CGATGCCCGC CCGCCCGCCA AAACCCACGA 

51 AAAGCCGTGG CTGCTGCTGT TGATGGCGTT TGCCTGGTTG TGGCCCGGCG 

101 TGTTTTCCCA CGATTTGTGG AATCCTGACG AACCTGCCGT CTATACCGCC 

151 GTCGAAGCAC TGGCAGGCAG CCCCACCCCC TTGGTTGCCC ATCTGTTCGG 

201 TCAAACCGAT TTCGGCATAC CGCCCGTGTA TCTTTGGGTT GCCGCCGCGT 

251 TCAAACATTT GCTGTCGCCG TGGGCTGCCG ACTCATACGA TGCCGCACGC 

301 TTTGCAGGCG TATTTTTTGC CGTTATCGGA CTGACTTCCT GCGGCTTTGC 

351 CGGTTTCAAC TTTTTGGGCA GACACCACGG GCGCAgCGTC GTCCTGATTC

401 TCATCGGCTG TATCGGGCTG ATTCCAGTTG CCCATTTCCT CAACCCCGCT
451 GCCGCCGCCT TTGCCGCCGC CGGACTGGTG CTGCACGGTT ATTCTTTGGC
501 TCGCCGGCGC GTGATTGCCG CCTCTTTTCT GCTCGGTACG GGCTGGACGC 
551 TGATGTCGTT GGCAGCAGCT TATCCGGCAG CATTTGCCCT GATGCTGCCC 
601 TTGCCCGTAC TGATGTTTTT CCGTCCGTGG CAAAGCAGGC GTTTGATGTT 
651 GACGGCAGTC GCCTCACTTG CCTTTGCCCT GCCGCTTATG ACCGTTTACC 
701 CGCTGCTCTT GGCAAAAACG CAGCCCGCGC TGTTCGCGCA ATGGCTCGAC 
751 TATCACGTTT TCGGTACGTT CGGCGGCGTG CGGCACGTTC AGACGGCATT 
801 CAGTTTGTTT TACTATCTGA AAAACCTGCT TTGGTTTGCA TTGCCCGCGC 
851 TGCCGCTGGC GGTTTGGACG GTTTGCCGCA CGCGCCTGTT TTCGACCGAC 
901 TGGGGGATTT TGGGCGTCGT CTGGATGCTT GCCGTTTTGG TGCTGCTTGC 
951 CGTCAATCCG CAGCGTTTTC AGGATAACCT CGTCTGGCTG CTTCCGCCGC 

1001 TTGCCCTGTT CGGCGCGGCG CAACTGGACA GCCTGAGGCG CGGCGCGGCG 

1051 GCGTTTGTCA ACTGGTTCGG CATTATGGCG TTCGGACTGT TTGCCGTGTT 

1101 CCTGTGGACG GGCTTTTTCG CCATGAATTA CGGCTGGCCC GCCAAGCTTG 

1151 CCGAACGCGC CGCCTATTTC AGCCCGTATT ATGTTCCTGA TATCGATCCC 

1201 ATTCCGATGG CGGTTGCCGT ACTGTTCACA CCCTTGTGGC TGTGGGCGAT 

1251 TACCCGGAAA AACATACGCG GCAGGCAGGC GGTTACCAAC TGGGCGGCAG 

1301 GCGTTACCCT GACCTGGGCT TTGCTGATGA CGCTGTTCCT GCCGTGGCTG 

1351 GACGCGGCGA AAAGCCACGC GCCGGTCGTC CGGAGTATGG AGGCATCGCT 

1401 TTCCCCGGAA TTGAAACGGG AGCTTTCAGA CGGCATCGAG TGTATCGGCA 

1451 TAGGCGGCGG CGACCTGCAC ACGCGGATTG TTTGGACGCA GTACGGCACA 

1501 TTGCCGCACC GCGTCGGCGA TGTACAATGC CGCTACCGCA TCGTCCTCCT 

1551 GCCCCAAAAT GCGGATGCGC CGCAAGGCTG GCAGACGGTT TGGCAGGGTG 

1601 CGCGTCCGCG CAACAAAGAC AGTAAGTTCG CACTGATACG GAAAATCGGG 

1651 GAAAATATAT AA 



This corresponds to the amino acid sequence [<SEQ ID 596; ORF141-1>] (SEQ ID NO: 596;
ORF141-1):
1 MLTYTPPDAR PPAKTHEKPW LLLLMAFAWL WPGVFSHDLW NPDEPAVYTA
51 VEALAGS PTP LVAHLFGQTD FGIPPVYLWV AAAFKHLLSP WAADSYDAAR 
101 FAGVFFAVIG LTSCGFAGFN FLGRHHGRSV VLILIGCIGL IPVAHFLNPA 
151 AAAFAAAGLV LHGYSLARRR VIAASFLLGT GWTLMSLA AA YPAAFALMLP 
201 LPVLMFFRPW QSRRL MLTAV ASLAFALPLM TVY PLLLAKT QPALFAQWLD

251 YHVFGTFGGV RHVQTAFSLF YYLKNLLWFA LPALPLAVWT VCRTRLFSTD 
301 W GILGVVWML AVLVLLAVN P QRFQDNLVWL LPPLALFGAA QLDSLRRGAA

351 AFVNWFGIMA FGLFAVFLWT GFFAMNYGWP AKLAERAAYF SPYYVPDIDP

401 IPMAVAVLFT PLWLWAITRK NIRGRQAVTN WAAGVTLTWA LLMTLFLPWL
451 DAAKSHAPVV RSMEASLSPE LKRELSDGIE CIGIGGGDLH TRIVWTQYGT

501 LPHRVGDVQC RYRIVLLPQN ADAPQGWQTV WQGARPRNKD SKFALIRKIG 
551 ENI* 

Computer analysis of this amino acid sequence gave the following results: 

Homology with a predicted ORF from N. meningitidis (strain A)

ORF141 (SEQ ID NO: 594) shows 95.0% identity over a 140aa overlap with an ORF (ORF141a)
(SEQ ID NO: 598) from strain A of N. meningitidis:

                                                  10        20        30
orf141.pep                                DFGISPVYLWVAAAFKHLLSPWAADSYDVA
                                          |||| |||||||||||||||||||| || |
orf141a     WNPDEPAVYTAVEALAGSPTPLVAHLFGQIDFGIPPVYLWVAAAFKHLLSPWAADPYDAA
           40        50        60        70        80        90

                    40        50        60        70        80        90
orf141.pep  RFAGVFFAVIGLTSCGFAGFNFLGRHHGRXVVLILIGCIGLIPVAHFLNPAAAAFAAAGL
            ||||||||| ||||||||||||||||||| |||||||||||||  |||||||||||||||
orf141a     RFAGVFFAVVGLTSCGFAGFNFLGRHHGRSVVLILIGCIGLIPTVHFLNPAAAAFAAAGL
          100       110       120       130       140       150

                   100       110       120       130       140
orf141.pep  VLHGYSLARRRVIAASFLLGTGWTLMSLAAAYPAAFALMLPLPVLMFFRP
            ||||||||||||||||||||||||||||||||||||||||||||||||||
orf141a     VLHGYSLARRRVIAASFLLGTGWTLMSLAAAYPAAFALMLPLPVLMFFRPWQSRRLMLTA
          160       170       180       190       200       210

orf141a     VASLAFALPLMTVYPLLLAKTQPALFAQWLDDHVFGTFGGVRHIQTAFSLFYYLKNLLWF
          220       230       240       250       260       270

The complete length ORF141a nucleotide sequence [<SEQ ID 597>] (SEQ ID NO: 597) is:



1 ATGCTGACCT ATACCCCGCC CGATGCCCGC CCGCCCGCCA AAACCCACGA 

51 AAAGCCGTGG CTGTTGCTGT TGATGGCGTT TGCCTGGTTG TGGCCCGGCG 

101 TGTTTTCCCA CGATTTGTGG AATCCTGACG AACCTGCCGT CTATACCGCC 

151 GTCGAAGCAC TGGCAGGCAG CCCCACCCCT TTGGTTGCCC ATCTGTTCGG

201 TCAAATCGAT TTCGGCATAC CGCCCGTGTA TCTTTGGGTT GCCGCCGCGT 

251 TCAAACATTT GCTGTCGCCG TGGGCTGCCG ACCCGTATGA TGCCGCACGC

301 TTTGCCGGCG TGTTTTTCGC CGTTGTCGGA CTGACTTCCT GCGGCTTTGC 

351 CGGTTTCAAC TTTTTGGGCA GACACCACGG GCGCAGCGTC GTCCTGATTC 

401 TCATCGGCTG TATCGGGCTG ATTCCGACCG TACACTTTCT CAACCCCGCT

451 GCCGCCGCCT TTGCCGCCGC CGGACTGGTG CTGCACGGTT ATTCTTTGGC

501 TCGCCGGCGC GTGATTGCCG CCTCTTTTCT GCTCGGTACG GGTTGGACGC 

551 TGATGTCGTT GGCAGCAGCT TATCCGGCGG CATTTGCCCT GATGCTGCCC 
601 CTGCCCGTGC TGATGTTTTT CCGTCCGTGG CAAAGCAGGC GTTTGATGTT

651 GACGGCAGTC GCCTCGCTTG CCTTTGCCCT GCCGCTTATG ACCGTTTACC 

701 CGCTGCTCTT GGCAAAAACG CAGCCCGCGC TGTTCGCGCA ATGGCTCGAC 

751 GATCACGTTT TCGGTACGTT CGGCGGCGTG CGGCACATTC AGACGGCATT 

801 CAGTTTGTTT TACTATCTGA AAAACCTGCT TTGGTTTGCA TTGCCTGCGC

851 TGCCGCTGGC GGTTTGGACG GTTTGCCGCA CGCGCCTGTT TTCGACCGAC 

901 TGGGGGATTT TGGGCGTCGT CTGGATGCTT GCCGTTTTGG TGCTGCTTGC 

951 CGTCAATCCG CAGCGTTTTC AGGATAACCT CGTCTGGCTG CTTCCGCCGC 

1001 TTGCCCTGTT CGGCGCGGCG CAACTGGACA GCCTGAGACG CGGCGCGGCG 

1051 GCGTTTGTCA ACTGGTTCGG CATTATGGCG TTCGGACTGT TTGCCGTGTT

1101 CCTGTGGACG GGCTTTTTCG CCATGAATTA CGGCTGGCCC GCCAAGCTTG 

1151 CCGAACGCGC CGCCTATTTC AGCCCGTATT ATGTTCCTGA TATCGATCCC 

1201 ATTCCGATGG CGGTTGCCGT ACTGTTCACA CCCTTGTGGC TGTGGGCGAT 

1251 TACCCGCAAA AACATACGCG GCAGGCAGGC GGTTACCAAC TGGGCGGCAG 

1301 GCGTTACCCT GACCTGGGCT TTGCTGATGA CGCTGTTCCT GCCGTGGCTG

1351 GACGCGGCGA AAAGCCACGC GCCCGTCGTC CGGAGTATGG AGGCATCGCT 

1401 TTCCCCGGAA TTAAAACGGG AGCTTTCAGA CGGCATCGAG TGTATCGACA

1451 TAGGCGGCGG CGACCTACAC ACGCGGATTG TTTGGACGCA GTACGGCACA 

1501 TTGCCGCACC GCGTCGGCGA TGTACAATGC CGCTACCGCA TCGTCCGCTT 

1551 GCCCCAAAAC GCGGATGCGC CGCAAGGCTG GCAGACGGTC TGGCAGGGTG

1601 CGCGCCCGCG CAACAAAGAC AGTAAGTTCG CACTGATACG GAAAACCGGG 

1651 GAAAATATAT TAAAAACAAC AGATTGA 

This encodes a protein having amino acid sequence [<SEQ ID 598>] (SEQ ID NO: 598):

1 MLTYTPPDAR PPAKTHEKPW LLLLMAFAWL WPGVFSHDLW NPDEPAVYTA
51 VEALAGSPTP LVAHLFGQID FGIPPVYLWV AAAFKHLLSP WAADPYDAAR
101 FAGVFFAVVG LTSCGFAGFN FLGRHHGRSV VLILIGCIGL IPTVHFLNPA
151 AAAFAAAGLV LHGYSLARRR VIAASFLLGT GWTLMSLAAA YPAAFALMLP
201 LPVLMFFRPW QSRRLMLTAV ASLAFALPLM TVYPLLLAKT QPALFAQWLD
251 DHVFGTFGGV RHIQTAFSLF YYLKNLLWFA LPALPLAVWT VCRTRLFSTD
301 WGILGVVWML AVLVLLAVNP QRFQDNLVWL LPPLALFGAA QLDSLRRGAA
351 AFVNWFGIMA FGLFAVFLWT GFFAMNYGWP AKLAERAAYF SPYYVPDIDP
401 IPMAVAVLFT PLWLWAITRK NIRGRQAVTN WAAGVTLTWA LLMTLFLPWL
451 DAAKSHAPVV RSMEASLSPE LKRELSDGIE CIDIGGGDLH TRIVWTQYGT
501 LPHRVGDVQC RYRIVRLPQN ADAPQGWQTV WQGARPRNKD SKFALIRKTG
551 ENILKTTD*

ORF141a (SEQ ID NO: 598) and ORF141-1 (SEQ ID NO: 596) show 98.2% identity in 553 aa
overlap:



orf141a.pep  MLTYTPPDARPPAKTHEKPWLLLLMAFAWLWPGVFSHDLWNPDEPAVYTAVEALAGSPTP
             ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
orf141-1     MLTYTPPDARPPAKTHEKPWLLLLMAFAWLWPGVFSHDLWNPDEPAVYTAVEALAGSPTP

orf141a.pep  LVAHLFGQIDFGIPPVYLWVAAAFKHLLSPWAADPYDAARFAGVFFAVVGLTSCGFAGFN
             |||||||| ||||||||||||||||||||||||| ||||||||||||| |||||||||||
orf141-1     LVAHLFGQTDFGIPPVYLWVAAAFKHLLSPWAADSYDAARFAGVFFAVIGLTSCGFAGFN

orf141a.pep  FLGRHHGRSVVLILIGCIGLIPTVHFLNPAAAAFAAAGLVLHGYSLARRRVIAASFLLGT
             ||||||||||||||||||||||  ||||||||||||||||||||||||||||||||||||
orf141-1     FLGRHHGRSVVLILIGCIGLIPVAHFLNPAAAAFAAAGLVLHGYSLARRRVIAASFLLGT

orf141a.pep  GWTLMSLAAAYPAAFALMLPLPVLMFFRPWQSRRLMLTAVASLAFALPLMTVYPLLLAKT
             ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
orf141-1     GWTLMSLAAAYPAAFALMLPLPVLMFFRPWQSRRLMLTAVASLAFALPLMTVYPLLLAKT

orf141a.pep  QPALFAQWLDDHVFGTFGGVRHIQTAFSLFYYLKNLLWFALPALPLAVWTVCRTRLFSTD
             |||||||||| ||||||||||| |||||||||||||||||||||||||||||||||||||
orf141-1     QPALFAQWLDYHVFGTFGGVRHVQTAFSLFYYLKNLLWFALPALPLAVWTVCRTRLFSTD

orf141a.pep  WGILGVVWMLAVLVLLAVNPQRFQDNLVWLLPPLALFGAAQLDSLRRGAAAFVNWFGIMA
             ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
orf141-1     WGILGVVWMLAVLVLLAVNPQRFQDNLVWLLPPLALFGAAQLDSLRRGAAAFVNWFGIMA

orf141a.pep  FGLFAVFLWTGFFAMNYGWPAKLAERAAYFSPYYVPDIDPIPMAVAVLFTPLWLWAITRK
             ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
orf141-1     FGLFAVFLWTGFFAMNYGWPAKLAERAAYFSPYYVPDIDPIPMAVAVLFTPLWLWAITRK

orf141a.pep  NIRGRQAVTNWAAGVTLTWALLMTLFLPWLDAAKSHAPVVRSMEASLSPELKRELSDGIE
             ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
orf141-1     NIRGRQAVTNWAAGVTLTWALLMTLFLPWLDAAKSHAPVVRSMEASLSPELKRELSDGIE

orf141a.pep  CIDIGGGDLHTRIVWTQYGTLPHRVGDVQCRYRIVRLPQNADAPQGWQTVWQGARPRNKD
             || |||||||||||||||||||||||||||||||| ||||||||||||||||||||||||
orf141-1     CIGIGGGDLHTRIVWTQYGTLPHRVGDVQCRYRIVLLPQNADAPQGWQTVWQGARPRNKD

orf141a.pep  SKFALIRKTGENI
             |||||||| ||||
orf141-1     SKFALIRKIGENI



Homology with a predicted ORF from N. gonorrhoeae 

ORF141 (SEQ ID NO: 594) shows 95% identity over a 140aa overlap with a predicted ORF
(ORF141ng) (SEQ ID NO: 600) from N. gonorrhoeae:



orf141.pep                                DFGISPVYLWVAAAFKHLLSPWAADSYDVA 30
                                          |||| |||||||||||||||||||  || |
orf141ng    WNPAEPAVYTAVEALAGSPTPLVAHLFGQTDFGIPPVYLWVAAAFKHLLSPWAAHPYDAA 126

orf141.pep  RFAGVFFAVIGLTSCGFAGFNFLGRHHGRXVVLILIGCIGLIPVAHFLNPAAAAFAAAGL 90
            ||||||||||||||||||||||||||||| |||| |||||||||||| ||||||||||||
orf141ng    RFAGVFFAVIGLTSCGFAGFNFLGRHHGRSVVLIHIGCIGLIPVAHFFNPAAAAFAAAGL 186

orf141.pep  VLHGYSLARRRVIAASFLLGTGWTLMSLAAAYPAAFALMLPLPVLMFFRP 140
            ||||||||||||||||||||||||||||||||||||||||||||||||||
orf141ng    VLHGYSLARRRVIAASFLLGTGWTLMSLAAAYPAAFALMLPLPVLMFFRPWQSRRLMLTA 246

An ORF141ng nucleotide sequence [<SEQ ID 599>] (SEQ ID NO: 599) was predicted to encode a
protein having amino acid sequence [<SEQ ID 600>] (SEQ ID NO: 600):

1 MPSEAVSARP LCEYLLHLAI RPFLLTLMLT YTPPDARPPA KTHEKPWLLL
51 LMAFAWLWPG VFSHDLWNPA EPAVYTAVEA LAGSPTPLVA HLFGQTDFGI
101 PPVYLWVAAA FKHLLSPWAA HPYDAARFAG VFFAVIGLTS CGFAGFNFLG
151 RHHGRSVVLI HIGCIGLIPV AHFFNPAAAA FAAAGLVLHG YSLARRRVIA
201 ASFLLGTGWT LMSLAAAYPA AFALMLPLPV LMFFRPWQSR RLMLTAVASL
251 AFALPLMTVY PLLLAKTQPA LFAQWLNYHV FGTFGGVRHI QRAFSLFHYL
301 KNLLWFAPPG LPLAVWTVCR TRLFSTDWGI LGIVWMLAVL VLLAFNPQRF
351 QDNLVWLLPP LALFGAAQLD SLRRGAAAFV NWFGIMAFGL FAVFLWTGFF
401 AMNYGWPAKL AERAAYFSPY YVPDIDPIPM AVAVLFTPLW LWAITRKNIR
451 GRQAVTNWAA GVTLTWALLM TLFLPWLDAA KSHAPVVRSM EASFSPELKR
501 ELSDGIECIG IGGGDLHTRI VWTQYGTLPH RVGDVRCRYR IVRLPQNADA
551 PQGWQTVWQG ARPRNKDSKF ALIRKIGENI LKTTD*

Further work revealed the following gonococcal DNA sequence [<SEQ ID 601>] (SEQ ID NO:
601):



1 ATGCTGACCT ATACCCCGCC CGATGCCCGC CCGCCCGCCA AAACCCACGA 

51 AAAACCGTGG CTGCTGCTGT TGATGGCGTT TGCCTGGCTG TGGCCCGGCG 

101 TGTTTTCCCA CGATTTGTGG AATCCTGCCG AACCTGCCGT CTATACCGCC 

151 GTCGAAGCAC TGGCAGGCAG CCCCACCCCC TTGGTTGCCC ATCTGTTCGG 

201 TCAAACCGAT TTCGGCATAC CGCCCGTGTA TCTTTGGGTT GCCGCCGCAT 

251 TCAAACATTT GCTGTCGCCG TGGGCAGCCG ACCCGTATGA TGCCGCACGC 

301 TTTGCAGGCG TATTTTTTGC CGTTATCGGA CTGACTTCTT GCGGCTTTGC 

351 CGGTTTCAAC TTTTTGGGCA GACACCACGG GCGCAGCGTT GTTTTAATCC 

401 ATATCGGCTG TATCGGGCTG ATTCCGGTTG CCCATTTCCT CAATCCcgcc

451 gccgccgcct tTGCCGCCGC CGGACTGGTG CTGCacggct actcgctgGC

501 ACGCCGGCGC GTGATtgccg cctctTtccT GCTCGGTACG GGTTGGACGT 

551 TGATGTCGCT GGCGGCAGCT TATCCGGCGG CGTTTGCGCT GATGCTGCCC 

601 CTGCCCGTGC TGATGTTTTT CCGTCCGTGG CAAAGCAGGC GTTTGATGTT 

651 GACGGCAGTC GCCTCGCTTG CCTTTGCCCT GCCGCTTATG ACCGTTTACC 

701 CGCTGCTCtt gGCAAAAACG CAGCCCGCGC TGTTTGCGCA ATGGCTCAAC 

751 TATCACGTTT TCGGTACGTt cggcgGCGTG CGGCAcaTTC AGAggGCatT 

801 Cagtttgttt cactatctgA AAaatctgct ttggttcgca ccgcccgggC 

851 TGCCGCTGGC GGTTTGGACG GTTTGCCGCA CACGCCTGTT TTCGACCGAC 

901 TGGGGGATTT TGGGCATTGT CTGGATGCTT GCCGTTTTGG TGCTGCTCGC 

951 CTTTAATCCG CAGCGTTTTC AAGACAACCT CGTCTGGCTG CTGCCGCCGC 

1001 TTGCCCTGTT CGGCGCGGCG CAACTGGACA GCCTGAGGCG CGGCGCGGCG 

1051 GCTTTTGTCA ACTGGTTCGG CATTATGGCG TTCGGGCTGT TTGCCGTGTT 

1101 CCTGTGGACG GGCTTTTTCG CCATGAATTA CGGCTGGCCC GCCAAGCTTG

1151 CCGAACGCGC CGCCTACTTC AGCCCGTATT ACGTTCCCGA CATCGATCCC 

1201 ATTCCGATGG CGGTTGCCGT ACTGTTCACA CCCTTGTGGC TGTGGGCGAT 

1251 TACCCGGAAA AACATACGCG GCAGGCAGGC GGTTACCAAC TGGGCGGCAG 

1301 GCGTTACCCT GACCTGGGCT TTGCTGATGA CGCTGTTCCT GCCGTGGCTG 

1351 GACGCGGCGA AAAGCCACGC GCCCGTCGTC CGGAGTATGG AGGCATCGTT 

1401 TTCCCCGGAA TTAAAACGGG AGCTTTCAGA CGGCATCGAG TGTATCGGCA

1451 TAGGCGGCGG CGACCTGCAC ACGCGGATTG TTTGGACGCA GTACGGCACA

1501 TTGCCGCACC GCGTCGGCGA TGTCCGTTGC CGCTACCGTA TCGTCCGCCT 

1551 GCCCCAAAAC GCGGATGCGC CGCAAGGCTG GCAGACGGTC TGGCAGGGTG 

1601 CGCGCCCGCG CAACAAAGAC AGTAAGTTTG CACTGATACG GAAAATCGGG 

1651 GAAAATATAT TAAAAACAAC AGATTGA 

This corresponds to the amino acid sequence [<SEQ ID 602; ORF141ng-1>] (SEQ ID NO: 602;
ORF141ng-1):



1 MLTYTPPDAR PPAKTHEKPW LLLLMAFAWL WPGVFSHDLW NPAEPAVYTA
51 VEALAGSPTP LVAHLFGQTD FGIPPVYLWV AAAFKHLLSP WAADPYDAAR
101 FAGVFFAVIG LTSCGFAGFN FLGRHHGRSV VLIHIGCIGL IPVAHFLNPA
151 AAAFAAAGLV LHGYSLARRR VIAASFLLGT GWTLMSLAAA YPAAFALMLP
201 LPVLMFFRPW QSRRLMLTAV ASLAFALPLM TVYPLLLAKT QPALFAQWLN
251 YHVFGTFGGV RHIQRAFSLF HYLKNLLWFA PPGLPLAVWT VCRTRLFSTD
301 WGILGIVWML AVLVLLAFNP QRFQDNLVWL LPPLALFGAA QLDSLRRGAA
351 AFVNWFGIMA FGLFAVFLWT GFFAMNYGWP AKLAERAAYF SPYYVPDIDP
401 IPMAVAVLFT PLWLWAITRK NIRGRQAVTN WAAGVTLTWA LLMTLFLPWL
451 DAAKSHAPVV RSMEASFSPE LKRELSDGIE CIGIGGGDLH TRIVWTQYGT
501 LPHRVGDVRC RYRIVRLPQN ADAPQGWQTV WQGARPRNKD SKFALIRKIG
551 ENILKTTD*

ORF141ng-1 (SEQ ID NO: 602) and ORF141-1 (SEQ ID NO: 596) show 97.5% identity in 553 aa
overlap:

orf141ng-1.pep  MLTYTPPDARPPAKTHEKPWLLLLMAFAWLWPGVFSHDLWNPAEPAVYTAVEALAGSPTP
                |||||||||||||||||||||||||||||||||||||||||| |||||||||||||||||
orf141-1        MLTYTPPDARPPAKTHEKPWLLLLMAFAWLWPGVFSHDLWNPDEPAVYTAVEALAGSPTP

orf141ng-1.pep  LVAHLFGQTDFGIPPVYLWVAAAFKHLLSPWAADPYDAARFAGVFFAVIGLTSCGFAGFN
                |||||||||||||||||||||||||||||||||| |||||||||||||||||||||||||
orf141-1        LVAHLFGQTDFGIPPVYLWVAAAFKHLLSPWAADSYDAARFAGVFFAVIGLTSCGFAGFN

orf141ng-1.pep  FLGRHHGRSVVLIHIGCIGLIPVAHFLNPAAAAFAAAGLVLHGYSLARRRVIAASFLLGT
                ||||||||||||| ||||||||||||||||||||||||||||||||||||||||||||||
orf141-1        FLGRHHGRSVVLILIGCIGLIPVAHFLNPAAAAFAAAGLVLHGYSLARRRVIAASFLLGT

orf141ng-1.pep  GWTLMSLAAAYPAAFALMLPLPVLMFFRPWQSRRLMLTAVASLAFALPLMTVYPLLLAKT
                ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
orf141-1        GWTLMSLAAAYPAAFALMLPLPVLMFFRPWQSRRLMLTAVASLAFALPLMTVYPLLLAKT

orf141ng-1.pep  QPALFAQWLNYHVFGTFGGVRHIQRAFSLFHYLKNLLWFAPPGLPLAVWTVCRTRLFSTD
                ||||||||| |||||||||||| | ||||| ||||||||| | |||||||||||||||||
orf141-1        QPALFAQWLDYHVFGTFGGVRHVQTAFSLFYYLKNLLWFALPALPLAVWTVCRTRLFSTD

orf141ng-1.pep  WGILGIVWMLAVLVLLAFNPQRFQDNLVWLLPPLALFGAAQLDSLRRGAAAFVNWFGIMA
                ||||| ||||||||||| ||||||||||||||||||||||||||||||||||||||||||
orf141-1        WGILGVVWMLAVLVLLAVNPQRFQDNLVWLLPPLALFGAAQLDSLRRGAAAFVNWFGIMA

orf141ng-1.pep  FGLFAVFLWTGFFAMNYGWPAKLAERAAYFSPYYVPDIDPIPMAVAVLFTPLWLWAITRK
                ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
orf141-1        FGLFAVFLWTGFFAMNYGWPAKLAERAAYFSPYYVPDIDPIPMAVAVLFTPLWLWAITRK

orf141ng-1.pep  NIRGRQAVTNWAAGVTLTWALLMTLFLPWLDAAKSHAPVVRSMEASFSPELKRELSDGIE
                |||||||||||||||||||||||||||||||||||||||||||||| |||||||||||||
orf141-1        NIRGRQAVTNWAAGVTLTWALLMTLFLPWLDAAKSHAPVVRSMEASLSPELKRELSDGIE

orf141ng-1.pep  CIGIGGGDLHTRIVWTQYGTLPHRVGDVRCRYRIVRLPQNADAPQGWQTVWQGARPRNKD
                |||||||||||||||||||||||||||| |||||| ||||||||||||||||||||||||
orf141-1        CIGIGGGDLHTRIVWTQYGTLPHRVGDVQCRYRIVLLPQNADAPQGWQTVWQGARPRNKD

orf141ng-1.pep  SKFALIRKIGENILKTTDX
                |||||||||||||
orf141-1        SKFALIRKIGENIX

Based on the presence of several putative transmembrane domains in the gonococcal protein, it is
predicted that the proteins from N. meningitidis and N. gonorrhoeae, and their epitopes, could be
useful antigens for vaccines or diagnostics, or for raising antibodies.
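
A prediction of putative transmembrane domains such as the one above is commonly made with a hydropathy scan. The following is a minimal sketch only: it assumes the Kyte-Doolittle scale, a 19-residue window and a cutoff of 1.6, none of which are stated in the specification, and the function names are hypothetical.

```python
# Kyte-Doolittle hydropathy values (assumed scale; the specification does not
# say which prediction method or parameters were actually used).
KD = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def hydropathy_windows(seq, window=19):
    """Return the mean hydropathy of every full window along the sequence."""
    seq = seq.upper().rstrip("*")
    scores = [KD.get(aa, 0.0) for aa in seq]  # unknown residues (e.g. X) score 0
    return [sum(scores[i:i + window]) / window
            for i in range(len(scores) - window + 1)]


def putative_tm_starts(seq, window=19, cutoff=1.6):
    """Return 1-based start positions of windows whose mean exceeds the cutoff."""
    return [i + 1
            for i, mean in enumerate(hydropathy_windows(seq, window))
            if mean > cutoff]
```

Calling `putative_tm_starts` on one of the full-length sequences above (with spaces and position numbers removed) lists the window starts that such a scan would flag; the result depends entirely on the assumed window and cutoff.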

Example 72 



The following partial DNA sequence was identified in N. meningitidis [<SEQ ID 603>] (SEQ ID
NO: 603):

1 ..CAATCCGCCA AATGGTTATC GGGCCAAACT CTAGTCGGCA CAGCAATTGG
51 GATACGCGGG CAGATAAAGC TTGGCGGCAA CCTGCATTAC GATATATTTA
101 CCGGCCGCGC ATTGAAAAAG CCCGAATTTT TCCAATCAAG GAAATGGGCA
151 AGCGGTTTTC AGGTAGGCTA TACGTTTTAA



This corresponds to the amino acid sequence [<SEQ ID 604; ORF142>] (SEQ ID NO: 604;
ORF142):

1 ..QSAKWLSGQT LVGTAIGIRG QIKLGGNLHY DIFTGRALKK PEFFQSRKWA
51 SGFQVGYTF*

Further work revealed the complete nucleotide sequence [<SEQ ID 605>] (SEQ ID NO: 605):



1 ATGGATAATT CGGGTAGTGA GGCGACAGGA AAATACCAAG GAAATATCAC 

51 TTTCTCTGCC GACAATCCTT TGGGACTGAG TGATATGTTC TATGTAAATT 

101 ATGGACGTTC GATTGGCGGT ACGCCCGATG AGGAAAGTTT TGACGGCCAT 

151 CGCAAAGAAG GCGGATCAAA CAATTACGCC GTACATTATT CAGCCCCTTT 

201 CGGTAAATGG ACATGGGCAT TCAATCACAA TGGCTACCGT TACCATCAGG 

251 CAGTTTCCGG ATTATCGGAA GTCTATGACT ATAATGGAAA AAGTTACAAT 

301 ACTGATTTCG GCTTCAACCG CCTGTTGTAT CGTGATGCCA AACGCAAAAC 

351 CTATCTCGGT GTAAAACTGT GGATGAGGGA AACAAAAAGT TACATTGATG 

401 ATGCCGAACT GACTGTACAA CGGCGTAAAA CTGCGGGTTG GTTGGCAGAA 

451 CTTTCCCACA AAGAATATAT CGGTCGCAGT ACGGCAGATT TTAAGTTGAA 

501 ATATAAACGC GGCACCGGCA TGAAAGATGC TCTGCGCGCG CCTGAAGAAG 

551 CCTTTGGCGA AGGCACGTCA CGTATGAAAA TTTGGACGGC ATCGGCTGAT 

601 GTAAATACTC CTTTTCAAAT CGGTAAACAG CTATTTGCCT ATGACACATC 

651 CGTTCATGCA CAATGGAACA AAACCCCGCT AACATCGCAA GACAAACTGG 

701 CTATCGGCGG ACACCACACC GTACGTGGCT TCGACGGTGA AATGAGTTTG 

751 TCTGCCGAGC GGGGATGGTA TTGGCGCAAC GATTTGAGCT GGCAATTTAA 

801 ACCAGGCCAT CAGCTTTATC TTGGGGCTGA TGTAGGACAT GTTTCAGGAC 

851 AATCCGCCAA ATGGTTATCG GGCCAAACTC TAGTCGGCAC AGCAATTGGG 

901 ATACGCGGGC AGATAAAGCT TGGCGGCAAC CTGCATTACG ATATATTTAC 

951 CGGCCGCGCA TTGAAAAAGC CCGAATTTTT CCAATCAAGG AAATGGGCAA 

1001 GCGGTTTTCA GGTAGGCTAT ACGTTTTAA 



This corresponds to the amino acid sequence [<SEQ ID 606; ORF142-1>] (SEQ ID NO: 606;
ORF142-1):



1 MDNSGSEATG KYQGNITFSA DNPLGLSDMF YVNYGRSIGG TPDEESFDGH

51 RKEGGSNNYA VHYSAPFGKW TWAFNHNGYR YHQAVSGLSE VYDYNGKSYN 

101 TDFGFNRLLY RDAKRKTYLG VKLWMRETKS YIDDAELTVQ RRKTAGWLAE 

151 LSHKEYIGRS TADFKLKYKR GTGMKDALRA PEEAFGEGTS RMKIWTASAD 

201 VNTPFQIGKQ LFAYDTSVHA QWNKTPLTSQ DKLAIGGHHT VRGFDGEMSL 

251 SAERGWYWRN DLSWQFKPGH QLYLGADVGH VSGQSAKWLS GQTLVGTAIG 

301 IRGQIKLGGN LHYDIFTGRA LKKPEFFQSR KWASGFQVGY TF*
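
The correspondence between a nucleotide listing and its amino acid listing can be checked by conceptual translation. A minimal sketch, assuming the standard genetic code and translation in the first reading frame; `clean_listing` and `translate` are hypothetical helper names, not part of the specification:

```python
from itertools import product

# Standard genetic code, codons enumerated in TCAG order (assumed here).
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def clean_listing(text):
    """Drop position numbers, spaces and dots from a printed sequence listing."""
    return "".join(ch for ch in text.upper() if ch.isalpha())


def translate(dna):
    """Translate frame 1 of a DNA listing; ambiguous codons become X."""
    dna = clean_listing(dna)
    return "".join(CODON_TABLE.get(dna[i:i + 3], "X")
                   for i in range(0, len(dna) - 2, 3))


# First 30 nucleotides of SEQ ID NO: 605 as printed above.
print(translate("1 ATGGATAATT CGGGTAGTGA GGCGACAGGA"))  # MDNSGSEATG
```

The output matches the first ten residues of SEQ ID NO: 606. Codons containing ambiguity codes such as N are reported as X, in line with the X residues in the listings.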

Computer analysis of this amino acid sequence gave the following results: 



Homology with a predicted ORF from N. gonorrhoeae



ORF142 (SEQ ID NO: 604) shows 88.1% identity over a 59aa overlap with a predicted ORF
(ORF142ng) (SEQ ID NO: 608) from N. gonorrhoeae:
orf142.pep                                QSAKWLSGQTLVGTAIGIRGQIKLGGNLHY 30
                                          ||||||||||| ||||||||||||||||||
orf142ng    RGWYWRNDLSWQFKPGHQLYLGADVGHVSGQSAKWLSGQTLAGTAIGIRGQIKLGGNLHY 313

orf142.pep  DIFTGRALKKPEFFQSRKWASGFQVGYTF 59
            |||||||||||| ||  ||  |||||| |
orf142ng    DIFTGRALKKPEYFQTKKWVTGFQVGYSF 342
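
The identity figure quoted for this overlap can be reproduced by counting matching positions in the two aligned segments. A minimal sketch; the function name is hypothetical, and the two strings are the ungapped 59-residue segments taken from SEQ ID NO: 604 and from residues 284-342 of SEQ ID NO: 608:

```python
def percent_identity(a, b):
    """Return (matches, aligned length, percent identity) for two aligned strings."""
    if len(a) != len(b):
        raise ValueError("aligned sequences must have equal length")
    matches = sum(1 for x, y in zip(a, b) if x == y and x != "-")
    return matches, len(a), 100.0 * matches / len(a)


ORF142 = "QSAKWLSGQTLVGTAIGIRGQIKLGGNLHYDIFTGRALKKPEFFQSRKWASGFQVGYTF"
ORF142NG_284_342 = "QSAKWLSGQTLAGTAIGIRGQIKLGGNLHYDIFTGRALKKPEYFQTKKWVTGFQVGYSF"

matches, length, pct = percent_identity(ORF142, ORF142NG_284_342)
print(f"{matches}/{length} = {pct:.1f}%")  # 52/59 = 88.1%
```

Gap characters are never counted as matches in this sketch; whether gapped columns should count toward the denominator is a convention that varies between alignment programs.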

The complete length ORF142ng nucleotide sequence [<SEQ ID 607>] (SEQ ID NO: 607) is:

1 ATGGATAATT CGGGTAGTGA GGCGACAGGA AAATACCAAG GAAATATCAC 

51 TTTCTCTGCC GACAATCCTT TTGGACTGAG TGATATGTTC TATGTAAATT 

101 ATGGACGTTC AATTGGCGGT ACGCCCGATG AGGAAAATTT TGACGGCCAT 

151 CGCAAAGAAG GCGGATCAAA CAATTACGCC GTACATTATT CAGCCCCTTT 

201 CGGTAAATGG ACATGGGCAT TCAATCACAA TGGCTACCGT TACCATCAGG 

251 CGGTTTCCGG ATTATCGGAA GTCTATGACT ATAATGGAAA AAGTTACAAC 

301 ACTGATTTCG GCTTCAACCG CCTGTTGTAT CGTGATGCCA AACGCAAAAC 

351 CTATCTCAGT GTAAAACTGT GGACGAGGGA AACAAAAAGT TACATTGATG 

401 ATGCCGAACT GACTGTACAA CGGCGTAAAA CCACAGGTTG GTTGGCAGAA

451 CTTTCCCACA AAGGATATAT CGGTCGCAGT ACGGCAGATT TTAAGTTGAA

501 ATATAAACAC GGCACCGGCA TGAAAGATGC TCTGCGCGCG CCTGAAGAAG 

551 CCTTTGGCGA AGGCACGTCA CGTATGAAAA TTTGGACGGC ATCGGCTGAT 

601 GTAAATACTC CTTTTCAAAT CGGTAAACAG CTATTTGCCT ATGACACATC 

651 CGTTCATGCA CAATGGAACA AAACCCCGCT AACATCGCAA GACAAACTGG 

701 CTATCGGCGG ACACCACACC GTACGTGGCT TCGACGGTGA AATGAGTTTG 

751 CCTGCCGAGC GGGGATGGTA TTGGCGCAAC GATTTGAGCT GGCAATTTAA 

801 ACCAGGCCAT CAGCTTTATC TTGGGGCTGA TGTAGGACAT GTTTCAGGAC 

851 AATCCGCCAA ATGGTTATCG GGCCAAACTC TAGCCGGCAC AGCAATTGGG 

901 ATACGCGGGC AGATAAAGCT TGGCGGCAAC CTGCATTACG ATATATTTAC 

951 CGGCCGTGCA TTGAAAAAGC CCGAATATTT TCAGACGAAG AAATGGGTAA 

1001 CGGGGTTTCA GGTGGGTTAT TCGTTTTGA 

This encodes a protein having amino acid sequence [<SEQ ID 608>] (SEQ ID NO: 608):

1 MDNSGSEATG KYQGNITFSA DNPFGLSDMF YVNYGRSIGG TPDEENFDGH 

51 RKEGGSNNYA VHYSAPFGKW TWAFNHNGYR YHQAVSGLSE VYDYNGKSYN 

101 TDFGFNRLLY RDAKRKTYLS VKLWTRETKS YIDDAELTVQ RRKTTGWLAE 

151 LSHKGYIGRS TADFKLKYKH GTGMKDALRA PEEAFGEGTS RMKIWTASAD 

201 VNTPFQIGKQ LFAYDTSVHA QWNKTPLTSQ DKLAIGGHHT VRGFDGEMSL 

251 PAERGWYWRN DLSWQFKPGH QLYLGADVGH VSGQSAKWLS GQTLAGTAIG 

301 IRGQIKLGGN LHYDIFTGRA LKKPEYFQTK KWVTGFQVGY SF*

The underlined sequence (aromatic-Xaa-aromatic amino acid motif) is usually found at the 
C-terminal end of outer membrane proteins. 
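
The presence of this C-terminal motif can be checked directly on a sequence listing. A minimal sketch, assuming that "aromatic" means Phe, Trp or Tyr (His is sometimes also counted) and that the motif occupies the last three residues; the function name is hypothetical:

```python
AROMATIC = set("FWY")  # assumption: Phe, Trp, Tyr only


def has_c_terminal_aromatic_motif(protein):
    """True if the last three residues read aromatic-Xaa-aromatic."""
    seq = protein.replace(" ", "").rstrip("*").upper()
    return len(seq) >= 3 and seq[-3] in AROMATIC and seq[-1] in AROMATIC


print(has_c_terminal_aromatic_motif("KWVTGFQVGY SF*"))  # True (Y-S-F, SEQ ID NO: 608)
print(has_c_terminal_aromatic_motif("KWASGFQVGY TF*"))  # True (Y-T-F, SEQ ID NO: 606)
```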



ORF142ng (SEQ ID NO: 608) and ORF142-1 (SEQ ID NO: 606) show 95.6% identity over 342aa
overlap:

orf142-1.pep  MDNSGSEATGKYQGNITFSADNPLGLSDMFYVNYGRSIGGTPDEESFDGHRKEGGSNNYA
              ||||||||||||||||||||||| ||||||||||||||||||||| ||||||||||||||
orf142ng-1    MDNSGSEATGKYQGNITFSADNPFGLSDMFYVNYGRSIGGTPDEENFDGHRKEGGSNNYA

orf142-1.pep  VHYSAPFGKWTWAFNHNGYRYHQAVSGLSEVYDYNGKSYNTDFGFNRLLYRDAKRKTYLG
              |||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
orf142ng-1    VHYSAPFGKWTWAFNHNGYRYHQAVSGLSEVYDYNGKSYNTDFGFNRLLYRDAKRKTYLS

orf142-1.pep  VKLWMRETKSYIDDAELTVQRRKTAGWLAELSHKEYIGRSTADFKLKYKRGTGMKDALRA
              |||| ||||||||||||||||||| ||||||||| |||||||||||||| ||||||||||
orf142ng-1    VKLWTRETKSYIDDAELTVQRRKTTGWLAELSHKGYIGRSTADFKLKYKHGTGMKDALRA

orf142-1.pep  PEEAFGEGTSRMKIWTASADVNTPFQIGKQLFAYDTSVHAQWNKTPLTSQDKLAIGGHHT
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
orf142ng-1    PEEAFGEGTSRMKIWTASADVNTPFQIGKQLFAYDTSVHAQWNKTPLTSQDKLAIGGHHT

orf142-1.pep  VRGFDGEMSLSAERGWYWRNDLSWQFKPGHQLYLGADVGHVSGQSAKWLSGQTLVGTAIG
              |||||||||| ||||||||||||||||||||||||||||||||||||||||||| |||||
orf142ng-1    VRGFDGEMSLPAERGWYWRNDLSWQFKPGHQLYLGADVGHVSGQSAKWLSGQTLAGTAIG

orf142-1.pep  IRGQIKLGGNLHYDIFTGRALKKPEFFQSRKWASGFQVGYTF
              ||||||||||||||||||||||||| ||  ||  |||||| |
orf142ng-1    IRGQIKLGGNLHYDIFTGRALKKPEYFQTKKWVTGFQVGYSF

In addition, ORF142ng (SEQ ID NO: 608) is homologous to the HecB protein (SEQ ID NO: 1149)
of E. chrysanthemi:



gi | 1772622 (L39897) HecB [Erwinia chrysanthemi] Length = 558 
Score = 119 bits (295), Expect = 3e-26 

Identities = 88/346 (25%), Positives = 151/346 (43%), Gaps = 22/346 (6%) 



Query: 2 DNSGSEATGKYQGNITFSADNPFGLSDMFYVNYGRSIGGTPDEENFDGHRKEGGSNNYAV 61

DNSG ++TG+ Q N + + DN FGL+D ++++ G S + + D + G 
Sbjct: 230 DNSGQKSTGEEQLNGSLALDNVFGLADQWFISAGHS SRFATSHDAESLQAG 280 

Query: 62 HYSAPFGKWTWAFNHNGYRYHQAVSGLSEVYDYNGKSYNTDFGFNRLLYRDAKRKTYLSV 121 
+S P+G W +N+ + RY + G S F +R+++RD KT ++ 

Sbjct: 281 -FSMPYGYWNLGYNYSQSRYRNTFINRDFPWHSTGDSDTHRFSLSRWFRDGTMKTAIAG 339

Query: 122 KLWTRETKSYIDDAELTVQRRKTTGWLAELSHKGYIGRSTADFKLKYKHGTGMKDALRAP 181 

R +Y++ + L RK + ++H + A F Y G + 

Sbjct: 340 TFSQRTGNNYLNGSLLPSSSRKLSSVSLGVNHSQKLWGGLATFNPTYNRGVRWLGSETDT 399 

Query: 182 EEAFGEGTSRMKIWTASADVNTPFQIGKQLFAYDTSVHAQWNKTPLTSQDKLAIGGHHTV 241 
+ + + E + WT SA P Y S + + Q++ L ++L +GG + +

Sbjct: 400 DKSADEPRAEFNKWTLSASYYHPV TDSITYLGSLYGQYSARALYGSEQLTLGGESSI 456 

Query: 242 RGFDGEMSLPAERGWYWRNDLSWQFKP GHQLYLGA- DVGHVSGQSAKWLSGQTLAG 296 

RGF E RG YWRN+L+WQ G+ ++ A D GH+ + +L G 

Sbjct: 457 RGF - REQYTSGNRGAYWRNELNWQAWQLPVLGNVTFMAAVDGGHLYNHKQDNSTAASLWG 515 

Query: 297 TAIGIRGQIKLGGNLHYDIFTGRALKKPEYFQTKKWVTGFQVGYSF 342

A+G+ . + L +G+P+Q V G++VG SF 
Sbjct: 516 GAVGMTVASRW LS QQ VT VGWP I S Y P AWLQ PDTM WG YRVGLS F 558 
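
The summary line of the BLAST report above carries three fractions over the same alignment length. A minimal sketch of how they can be read back and recomputed; the regular expression and function name are illustrative assumptions, and the percentages are truncated because that is what reproduces the printed values (151/346 is about 43.6% and is reported as 43%):

```python
import re

STAT = re.compile(r"(Identities|Positives|Gaps)\s*=\s*(\d+)/(\d+)")


def parse_blast_stats(line):
    """Return {name: (count, total, truncated percent)} for a BLAST summary line."""
    return {name: (int(n), int(d), 100 * int(n) // int(d))
            for name, n, d in STAT.findall(line)}


line = "Identities = 88/346 (25%), Positives = 151/346 (43%), Gaps = 22/346 (6%)"
print(parse_blast_stats(line))
```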

On the basis of this analysis, it is predicted that the proteins from N. meningitidis and
N. gonorrhoeae, and their epitopes, could be useful antigens for vaccines or diagnostics, or for
raising antibodies.

Example 73



The following partial DNA sequence was identified in N. meningitidis [<SEQ ID 609>] (SEQ ID
NO: 609):



1 ATGCGGACGA AATGGTCAGC AGTGAGAAGC TGCTTACTTG GgCGGACACC 

51 GCCGACATCG ATACCGCTTT GAACCTGTTG TACCGTTTGC AAAAACTCGA 

101 ATTCCTCTAT GGCGATGAAA ACGGTCATTC AGACGGCATC AATTTGwCGG 

151 ACGAGCAATT GCCGTTGCTG ATGGAACAAT TGTCCGGCAG CGGTAAGGCG 

201 TTATTGGTCG ATCGGAACGG TCTGTATCTT GCCAACGCCA ATTTCCATCA 

251 TGAGGCGGCG GAAGAGTTGG GGTTGTTGGC GGCAGAAGTC GCACAGATGG 

301 AAAAGAAATA CCGGCTGCTG ATTAAGAACA AC. . 

This corresponds to the amino acid sequence [<SEQ ID 610; ORF143>] (SEQ ID NO: 610; ORF143):

1 MRTKWSAVRS CTWADTADID TALNLLYRLQ KLEFLYGDEN GHSDGINLXD
51 EQLPLLMEQL SGSGKALLVD RNGLYLANAN FHHEAAEELG LLAAEVAQME
101 KKYRLLIKNN..

Further work revealed the complete nucleotide sequence [<SEQ ID 611>] (SEQ ID NO: 611):



1 ATGGAATCAA CACTTTCACT ACAAGCAAAT TTATATCCCC GCCTGACTCC 

51 TGCCGGTGCA TTTTATGCCG TATCCAGCGA TGCCCCCAGT GCCGGTAAAA 

101 CTTTGTTGCA CAGCCTGTTG AAAGCAGATG CGGACGAAAT GGTCAGCAGT

151 GAGAAGCTGC TTACTTGGGC GGACACCGCC GACATCGATA CCGCTTTGAA 

201 CCTGTTGTAC CGTTTGCAAA AACTCGAATT CCTCTATGGC GATGAAAACG 

251 GTCATTCAGA CGGCATCAAT TTGTCGGACG AGCAATTGCC GTTGCTGATG 

301 GAACAATTGT CCGGCAGCGG TAAGGCGTTA TTGGTCGATC GGAACGGTCT 

351 GTATCTTGCC AACGCCAATT TCCATCATGA GGCGGCGGAA GAGTTGGGGT 

401 TGTTGGCGGC AGAAGTCGCA CAGATGGAAA AGAAATACCG GCTGCTGATT 

451 AAGAACAACC TGTATATCAA CAATAACGCT TGGGGCGTTT GCGATCCTTC

501 CGGTCAGAGC GAATTGACAT TTTTCCCATT GTATATCGGT TCAACCAAAT 

551 TTATTTTGGT TATCGGCGGC ATTCCCGATT TGGGCAAAGA GGCATTTGTT 

601 ACTTTGGTAA GGATTTTATA CCGCCGTTAC AGCAACCGCG TGTAA 

This corresponds to the amino acid sequence [<SEQ ID 612; ORF143-1>] (SEQ ID NO: 612;
ORF143-1):



1 MESTLSLQAN LYPRLTPAGA FYAVSSDAPS AGKTLLHSLL KADADEMVSS 

51 EKLLTWADTA DIDTALNLLY RLQKLEFLYG DENGHSDGIN LSDEQLPLLM 

101 EQLSGSGKAL LVDRNGLYLA NANFHHEAAE ELGLLAAEVA QMEKKYRLLI 

151 KNNLYINNNA WGVCDPSGQS ELTFFPLYIG STKFILVIGG IPDLGKEAFV

201 TLVRILYRRY SNRV* 

Computer analysis of this amino acid sequence gave the following results: 



Homology with a predicted ORF from N. meningitidis (strain A)

ORF143 (SEQ ID NO: 610) shows 92.4% identity over a 105aa overlap with an ORF (ORF143a)
(SEQ ID NO: 614) from strain A of N. meningitidis:

10 20 30 

orf 143 .pep MRTKWSAVRSCTWADTADIDTALNLLYRLQKLEFL 

I : Ml llllllllllllllllllll 
orf 14 3a GAFYAVSSDXPSAGKTLLHSLLKADADEMVSSEKLLTWAXTADIDTALNLLYRLQKLEFL 
20 30 40 50 60 70 

40 50 60 70 80 90 

orf 143 . pep YGDENGHSDGINLXDEQLPLLMEQLSGSGKALLVDRNGLYLANANFHHEAAEELGLLAAE 

Illllllllllll llllllll IMIIIMIIIIIIMI llllllll lllllll 

orf 143a YGDENGHSDGINLSDEQLPLLMEQLSGSGKALLVDRNGLYLANANFHHEAAEELGLLAAE 
80 90 100 110 120 130. 



100 110 
orf 143 . pep VAQMEKKYRLLI KNN 
llllllll | | M 

orf 14 3a VAQMEKKYRLXIKNNLYINNNAWGVCDPSGQSELT FFPLYIGSTKFILVIGG IPDLGKEA 
140 150 160 170 180 190 



The complete length ORF143a nucleotide sequence [<SEQ ID 613>] (SEQ ID NO: 613) is:



1 ATGGAATCAA CANTTTCACT ACAAGCAAAT TTATATCNCC GCCTGACTCC 

51 TGCCGGTGCA TTTTATGCCG TATCCAGCGA TGNCCCCAGT GCCGGTAAAA 

101 CTTTGTTGCA CAGCCTGTTG AAAGCGGATG CGGACGAAAT GGTNAGCAGT 

151 GAGAAGCTGC TTACCTGGGC GGANACCGCC GACATCGATA CCGCTTTGAA 

201 CCTGTTGTAC CGTTTGCAAA AACTCGAATT CCTCTATGGC GATGAAAACG 

251 GTCATTCAGA CGGCATCAAT TTGTCGGACG AGCAATTGCC GTTGCTGATG 

301 GAACAATTGT CCGGCAGCGG TAAGGCGTTA TTGGTCGATC GGAACGGTCT 

351 GTATCTTGCC AACGCCAATT TCCATCATGA GGCGGCGGAA GAGTTGGGGT 

401 TGTTGGCGGC AGAAGTCGCA CAGATGGAAA AGAAATACCG GCTGCNNATT 

451 AAGAACAACC TGTATATCAA CAATAACGCT TGGGGCGTTT GCGATCCTTC 

501 CGGTCAGAGC GAATTGACAT TTTTCCCATT GTATATCGGT TCAACCAAAT 

551 TTATTTTGGT TATCGGCGGC ATTCCCGATT TGGGCAAAGA GGCATTTGTT 

601 ACTTTGGTAA GGATNTTATA CCNCCNGTTA CAGCAACCGC GTGTAAAACT 

651 TGGGAGAGAG GANGGGTTAT GCAGCAATTA TTGA 



This encodes a protein having amino acid sequence [<SEQ ID 614>] (SEQ ID NO: 614):



1 MESTXSLQAN LYXRLTPAGA FYAVSSDXPS AGKTLLHSLL KADADEMVSS 

51 EKLLTWAXTA DIDTALNLLY RLQKLEFLYG DENGHSDGIN LSDEQLPLLM 

101 EQLSGSGKAL LVDRNGLYLA NANFHHEAAE ELGLLAAEVA QMEKKYRLXI

151 KNNLYINNNA WGVCDPSGQS ELTFFPLYIG STKFILVIGG IPDLGKEAFV

201 TLVRXLYXXL QQPRVKLGRE XGLCSNY* 

ORF143a (SEQ ID NO: 614) and ORF143-1 (SEQ ID NO: 612) show 97.1% identity in 207 aa
overlap:



orf 14 3a. pep MESTXSLQANLYXRLTPAGAFYAVSSDXPSAGKTLLHSLLKADADEMVSSEKLLTWAXTA 

I I I I lllllll llllllllllllll I I I I I I I I I I I I I I I I M I I I I I I I I I II 
orf 143 - 1 MESTLSLQANLYPRLTPAGAFYAVSSDAPSAGKTLLHSLLKADADEMVSSEKLLTWADTA 

orf 14 3a. pep DIDTALNLLYRLQKLEFLYGDENGHSDGINLSDEQLPLLMEQLSGSGKALLVDRNGLYLA 
1 1 1 1 1 M I ! M I M 1 1 1 1 i 1 1 1 1 1 1 1 1 1 ! I II 1 1 1 1 1 1 M 1 1 ! 

orf 143 - 1 DIDTALNLLYRLQKLEFLYGDENGHSDGINLSDEQLPLLMEQLSGSGKALLVDRNGLYLA 
orf 14 3a. pep NANFHHEAAEELGLLAAEVAQMEKKYRLXIKNNLYINNNAWGVCDPSGQSELTFFPLYIG 

IIIIIIIIIIIIIIIIIIIIIIIIMII II I II II II II MIMI MM II II II MM 

orf 143-1 NANFHHEAAEELGLLAAEVAQMEKKYRLLIKNNLYINNNAWGVCDPSGQSELTFFPLYIG 

orf 143a. pep STKFILVIGGI PDLGKEAFVTLVRXLY 

I I I I 1 I I I I I ■ I II 
orf 143-1 STKFILVIGGI PDLGKEAFVTLVRILY 

Homology with a predicted ORF from N. gonorrhoeae 

ORF143 (SEQ ID NO: 610) shows 95.5% identity over a 110aa overlap with a predicted ORF
(ORF143ng) (SEQ ID NO: 616) from N. gonorrhoeae:



orf 143 . pep MRTKWSAVRSCTWADTADIDTALNLLYRLQKLEFLYGDENGHSDGINLXDEQLPLLMEQL 60 

Illlllllllh J 1 1 1 1 1 1 1 1 1 1 1 1 1 M 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 IMIIIMIII 

orf 143ng MRTKWSAVRSCSRADTADIDTALNLLYRLQKLEFLYGDENGHSDGINLSDEQLPLLMEQL 60 

orf 143 .pep SGSGKALLVDRNGLYLANANFHHEAAEELGLLAAEVAQMEKKYRLLIKNN 110 

Mill Mill I 1 1 1 1 1 1 1 1 II I M 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 II II I M 

orf 14 3ng SGSGKALLVDRNGLYLANANFHHESAEELGLLAAEVAQMEKKYRLLIRNNLYINNNAWGV 12 0 

An ORF143ng nucleotide sequence [<SEQ ID 615>] (SEQ ID NO: 615) was predicted to encode a
protein having amino acid sequence [<SEQ ID 616>] (SEQ ID NO: 616):



1 MRTKWSAVRS CSRADTADID TALNLLYRLQ KLEFLYGDEN GHSDGINLSD 

51 EQLPLLMEQL SGSGKALLVD RNGLYLANAN FHHESAEELG LLAAEVAQME 

101 KKYRLLIRNN LYINNNAWGV CDPSGQSELT FFPLYIGSTK FILVIAGIPD

151 LSKGGICYFG KDFIPPLQQP RVKLGTGGIM RQLLISILED LNNTSTDIIA 

201 SAVISTDGLP MATMLPSHLN SDRVGAISAT LLALGSRSVQ ELACGELEQV 

251 MIKGKSGYIL LSQAGKDAVL VLVAKETGRL GLILLDAKRA ARHIAEAI*

Further work revealed the following gonococcal DNA sequence [<SEQ ID 617>] (SEQ ID NO:
617):



1 ATGGAATCAA CACTTTCACT ACAAGCGAAT TTATATCCCT GCCTGACTCC 

51 TGCCGGTGCA TTTTATGCCG TATCCAGCGA TGCCCCCAGT GCCGGTAAAA 

101 CTTTGTTGCG CAGCCTGTTG AAAGCGGATG CGGACGAAGT GGTCAGCAGT 

151 GAGAAGCTGC TCGCGGCGGA CACCGCCGAC ATCGATACCG CTTTGAACCT 

201 GTTGTACCGT TTGCAAAAAC TCGAATTCCT CTATGGCGAT GAAAACGGTC 

251 ATTCAGACGG CATCAATTTG TCGGACGAGC AATTGCCGTT GCTGATGGAA 

301 CAATTGTCCG GCAGCGGTAA GGCATTATTG GTCGATCGGA ACGGTCTGTA
351 TCTTGCCAAC GCCAATTTCC ATCATGAGTC GGCGGAAGAG TTGGGGTTGT

401 TGGCGGCAGA AGTCGCACAG ATGGAAAAGA AATACCGGCT GCTGATTAGG
451 AACAACCTGT ATATCAACAA TAACGCTTGG GGCGTTTGCG ATCCTTCCGG
501 TCAGAGCGAA TTGACATTTT TCCCATTGTA TATCGGTTCA ACCAAATTTA 
551 TTTTGGTTAT CGCCGGCATT CCCGATTTGA GCAAAGAGGC ATTTGTTACT 
601 TTGGTAAGGA TTTTATACCG CCGTTACAGC AACCGCGTGT AA 

This corresponds to the amino acid sequence [<SEQ ID 618; ORF143ng-1>] (SEQ ID NO: 618;
ORF143ng-1):

1 MESTLSLQAN LYPCLTPAGA FYAVSSDAPS AGKTLLRSLL KADADEVVSS
51 EKLLAADTAD IDTALNLLYR LQKLEFLYGD ENGHSDGINL SDEQLPLLME
101 QLSGSGKALL VDRNGLYLAN ANFHHESAEE LGLLAAEVAQ MEKKYRLLIR
151 NNLYINNNAW GVCDPSGQSE LTFFPLYIGS TKFILVIAGI PDLSKEAFVT
201 LVRILYRRYS NRV*

ORF143ng-1 (SEQ ID NO: 618) and ORF143-1 (SEQ ID NO: 612) show 95.8% identity in 214 aa
overlap:

orf 143ng-l .pep MESTLSLQANLYPCLTPAGAFYAVSSDAPSAGKTLLRSLLKADADEWSSEKLLA- ADTA 59 

II I II II II I II I ill III I II II II I II INI I hi II I II II hi II II Ih 1 1 1 1 

orf 143 - 1 MESTLSLQANLYPRLTPAGAFYAVSSDAPSAGKTLLHSLLKADADEMVSSEKLLTWADTA 60 

orf 143ng-l .pep DIDTALNLLYRLQKLEFLYGDENGHSDGINLSDEQLPLLMEQLSGSGKALLVDRNGLYLA 119 

15 1 1 1 1 1 1 1 1 1 . 1 1 1 1 1 1 1 II 1 1 i 1 1 1 1 1 1 1 1 1 1 1 II 1 1 Ml 1 1 M M 1 1 1 1 1 1 1 1 1 M 

orf 143 - 1 DIDTALNLLYRLQKLEFLYGDENGHSDGINLSDEQLPLLMEQLSGSGKALLVDRNGLYLA 120 

orf 143ng-l .pep NANFHHESAEELGLLAAEVAQMEKKYRLLIRNNLYINNNAWGVCDPSGQSELTFFPLYIG 179 

1 1 1 1 1 1 :| M 1 1 1 1 II II 1 1 1 M 1 1 1 ! I h 1 1 M I : I I I I I I I I I I I I I I M I I I M 

orf 143 - 1 NANFHHEAAEELGLLAAEVAQMEKKYRLLIKNNLYINNNAWGVCDPSGQSELTFFPLYIG 180 

20 orfl43ng-l.pep STKF I LV I AG I PDLS KEAFVTLVR I LYRRYSNRV 213 

MINIM:. IMMII MINIM Mill 

orf 143 - 1 STKF I LVIGG I PDLGKEAFVTLVRI LYRRYSNRV 214 

Based on the presence of the putative transmembrane domains in the gonococcal protein, it is
predicted that the proteins from N. meningitidis and N. gonorrhoeae, and their epitopes, could be
useful antigens for vaccines or diagnostics, or for raising antibodies.

Example 74 

The following partial DNA sequence was identified in N. meningitidis [<SEQ ID 619>] (SEQ ID
NO: 619):

1 ATGACCTTTT TACAACGTTT GCAAGGTTTG GCAGACAATA AAATCTGTGC

51 GTTTGCATGG TTCGTCGTCC GCCGCTTTGA TGAAGAACGC GTACCGCAGr 

101 CGGCGGCAAG CATGACGTTT ACGACGCTGC TGGCACTCGT CCCCGTGCTG 

151 ACCGTGATGG TGGCGGTCGC TTCGATTTTC CCCGTGTTCG ACCGCTGGTC 

201 GGATTCGTTC GTCTCCTTCG TCAACCAAAC CATTGTGCCG CA.GGCGCGG 

251 ACATGGTGTT CGACTATATC AATGCGTTCC GCGAGCAGGC GAACCGGCTG

301 ACGGCAATCG GCAGCGTGAT GCTGGTCGTT ACCTCGCTGA TGCTGATTCG
351 GACGATAGAC AATACGTTCA ACCGCATCTG GjiCGGGTCAA wTyCCAGCGT 

401 CCGTGGATG..

This corresponds to the amino acid sequence [<SEQ ID 620; ORF144>] (SEQ ID NO: 620; ORF144):

1 MTFLQRLQGL ADNKICAFAW FVVRRFDEER VPQXAASMTF TTLLALVPVL

51 TVMVAVASIF PVFDRWSDSF VSFVNQTIVP XGADMVFDYI NAFREQANRL

101 TAIGSVMLVV TSLMLIRTID NTFNRIWRVX XQRPWM...

Further work revealed the complete nucleotide sequence [<SEQ ID 621>] (SEQ ID NO: 621):

1 ATGACCTTTT TACAACGTTT GCAAGGTTTG GCAGACAATA AAATCTGTGC 

51 GTTTGCATGG TTCGTCGTCC GCCGCTTTGA TGAAGAACGC GTACCGCAGG 

101 CGGCGGCAAG CATGACGTTT ACGACGCTGC TGGCACTCGT CCCCGTGCTG

151 ACCGTGATGG TGGCGGTCGC TTCGATTTTC CCCGTGTTCG ACCGCTGGTC 

201 GGATTCGTTC GTCTCCTTCG TCAACCAAAC CATTGTGCCG CAGGGCGCGG 

251 ACATGGTGTT CGACTATATC AATGCGTTCC GCGAGCAGGC GAACCGGCTG 

301 ACGGCAATCG GCAGCGTGAT GCTGGTCGTT ACCTCGCTGA TGCTGATTCG 

351 GACGATAGAC AATACGTTCA ACCGCATCTG GCGGGTCAAT TCCCAGCGTC

401 CGTGGATGAT GCAGTTTCTC GTCTATTGGG CTTTACTGAC GTTCGGGCCG

451 CTGTCTTTGG GCGTGGGCAT TTCCTTTATG GTCGGCTCGG TACAGGATGC 

501 CGCGCTTGCC TCAGGTGCGC CGCAGTGGTC GGGCGCGTTG CGAACGGCGG 

551 CGACGCTGAC CTTCATGACG CTTTTGCTGT GGGGGCTGTA CCGCTTCGTG 

601 CCAAACCGCT TCGTTCCCGC GCGGCAGGCG TTTGTCGGGG CTTTGGCAAC

651 AGCGTTTTGT CTGGAAACCG CGCGCTCCCT CTTCACTTGG TATATGGGCA 

701 ATTTCGACGG CTACCGCTCG ATTTACGGCG CGTTTGCCGC CGTGCCGTTT

751 TTTCTGTTGT GGCTGAACCT GTTGTGGACG CTGGTCTTGG GCGGCGCGGT 

801 GCTGACTTCT TCACTCTCCT ACTGGCAGGG AGAAGCGTTC CGCAGGGGCT 

851 TCGACTCGCG CGGACGGTTT GACGACGTGT TGAAAATCCT GCTGCTTCTG

901 GATGCGGCGC AAAAAGAAGG CAAAGCCTTG CCTGTTCAGG AGTTCAGACG 

951 GCATATCAAT ATGGGCTACG ACGAGTTGGG CGAGCTTTTG GAAAAGCTGG 

1001 CGCGGCACGG CTACATCTAT TCCGGCAGAC AGGGTTGGGT GTTGAAAACG 

1051 GGGGCGGATT CGATTGAGTT GAACGAACTC TTCAAGCTCT TCGTTTACCG 

1101 TCCGTTGCCT GTGGAAAGGG ATCATGTGAA CCAAGCTGTC GATGCGGTAA

1151 TGACACCGTG TTTGCAGACT TTGAACATGA CGCTGGCAGA GTTTGACGCT 

1201 CAGGCGAAAA AACGGCAGTA G

This corresponds to the amino acid sequence [<SEQ ID 622; ORF144-1>] (SEQ ID NO: 622; ORF144-1):

1 MTFLQRLQGL ADNKICAFAW FVVRRFDEER VPQAAASMTF TTLLALVPVL
51 TVMVAVASIF PVFDRWSDSF VSFVNQTIVP QGADMVFDYI NAFREQANRL
101 TAIGSVMLVV TSLMLIRTID NTFNRIWRVN SQRPWMMQFL VYWALLTFGP
151 LSLGVGISFM VGSVQDAALA SGAPQWSGAL RTAATLTFMT LLLWGLYRFV
201 PNRFVPARQA FVGALATAFC LETARSLFTW YMGNFDGYRS IYGAFAAVPF
251 FLLWLNLLWT LVLGGAVLTS SLSYWQGEAF RRGFDSRGRF DDVLKILLLL
301 DAAQKEGKAL PVQEFRRHIN MGYDELGELL EKLARHGYIY SGRQGWVLKT
351 GADSIELNEL FKLFVYRPLP VERDHVNQAV DAVMTPCLQT LNMTLAEFDA
401 QAKKRQ*


Computer analysis of this amino acid sequence gave the following results: 



Homology with a predicted ORF from N. meningitidis (strain A)

ORF144 (SEQ ID NO: 620) shows 96.3% identity over a 136aa overlap with an ORF (ORF144a)
(SEQ ID NO: 624) from strain A of N. meningitidis:

10 20 30 40 50 60 

orf 144 . pep MTFLQRLQGLADNKICAFAW FVVRRFDEERVPQXAASMTFTT LLALVPVLTVMVAVASI F 

5 1 1 1! 1 1 Ml 1 1 1 1 1 1 1 1 1 II 1 1 1 1 M I II 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 II I M I 

orf 144a MTFLQRLQGLADNKICAFAW FWRRFDEERVPQAAASMTFTT LLALVPVLTVMVAVASI F 

10 20 30 40 50 60 

70 80 90 100 110 120 

orf 144 .pep P VFDRWSDS FYS FVNQT I VPXGADMVFD Y I NAFREQANR LTA I GSVMLWTS LML IRTID 

10 1 1 II 1 1 1 ! 1 1 II 1 1 1 1 1 1 1 1 III lllllllllllllllll IIIIMM lllllll 

. orf 144a P VFDRWSDS FVS FVNQT I VPQGADMVFD Y I NAFREQANR LTA I GS VML WTSXML I RT I D 

70 * 80 90 100 110 120 

130 

orf 144 .pep NTFNRIWRVXXQRPWM 

15 I I I I I I I II I I I I I 

orf 144a NTFNRIWRVNSQRPWMMQFLVYW ALLTFGPLSLGVGISFXV GSVQDAALASGAPQWSGAL 

130 140 150 160 170 180 

The complete length ORF144a nucleotide sequence [<SEQ ID 623>] (SEQ ID NO: 623) is:

1 ATGACCTTTT TACAACGTTT GCAAGGTTTG GCAGACAATA AAATCTGTGC

51 GTTTGCATGG TTCGTCGTCC GCCGCTTTGA TGAAGAACGC GTACCGCAGG 

101 CGGCGGCAAG CATGACGTTT ACGACACTGC TGGCACTCGT CCCCGTGCTG 

151 ACCGTGATGG TGGCGGTCGC TTCGATTTTC CCCGTGTTCG ACCGNTGGTC 

201 GGATTCGTTC GTCTCCTTCG TCAACCAAAC CATTGTGCCG CAGGGCGCGG 

251 ACATGGTNTT CGACTATATC AATGCGTTCC GCGAGCAGGC GAACCGGCTG

301 ACGGCAATCG GCAGCGTGAT GCTGGTCGTT ACCTCGCNGA TGCTGATTCG 

351 GACGATAGAC AATACGTTCA ACCGCATCTG GCGGGTCAAT TCCCAGCGTC 

401 CGTGGATGAT GCAGTTTCTC GTCTATTGGG CTTTACTGAC GTTCGGGCCG 

451 CTGTCTTTGG GCGTGGGCAT TTCCTTTATN GTCGGCTCGG TACAGGATGC 

501 CGCGCTTGCC TCAGGTGCGC CGCAGTGGTC GGGCGCGTTG CGAACGGCGG

551 CGACGCTGAN CTTCATGACG CTTTTGCTGT GGGGGCTGTA CCGCTNCGTG 

601 CCAAACCGCT TCGTTCCCGC GCGGCANGCG TTTGTCGGGG CTTTGGCAAC 

651 AGCGTTCTGT CTGGAAACCG CGCGTTCCCT CTTTACTTGG TATATGGGCA 

701 ATTTCGACGG CTACCGCTCG ATTTACGGNG CGTTTGCCGC CGTGCCGTTT 

751 TTTCTGTTGT GGCTGAACCT GTTGTGGACG CTGGTCTTGG GCGGCGCGGT

801 GCTGACTTCT TCACTCTCCT ACTGGCAGGG AGAAGCGTTC CGCAGGGNCT 

851 TCGACTCGCG CGGACGGTTT GACGACGTGT TGAAAATCCT GCTGCTTCTG 

901 GATGCGGCGC AAAAAGAAGG CNAAGCCTTG CCTGTTCAGG AGTTCAGACG

951 GCATATCAAT ATGGGCTACG ACGAGTTGGG CGAGCTTTTG GAAAAGCTGG 

1001 CGCGGCACGG CTACATCTAT TCCGGCAGAC AGGGTTGGGT GTTGAAAACG

1051 GGGGCGGATT CGATTGAGTT GAACGAACTC TTCAAGCTCT TCGTTTACCG 

1101 TCCGTTGCCT GTGGAAAGGG ATCATGTGAA CCAAGCTGTC GATGCGGTAA 

1151 TGATGCCGTG TTTGCAGACT TTGAACATGA CGCTGGCAGA GTTTGACGCT 

1201 CAGGCGAAAA AACAGCAGCA ATCTTGA 






This encodes a protein having amino acid sequence [<SEQ ID 624>] (SEQ ID NO: 624):



1 MTFLQRLQGL ADNKICAFAW FVVRRFDEER VPQAAASMTF TTLLALVPVL
51 TVMVAVASIF PVFDRWSDSF VSFVNQTIVP QGADMVFDYI NAFREQANRL
101 TAIGSVMLVV TSXMLIRTID NTFNRIWRVN SQRPWMMQFL VYWALLTFGP
151 LSLGVGISFX VGSVQDAALA SGAPQWSGAL RTAATLXFMT LLLWGLYRXV
201 PNRFVPARXA FVGALATAFC LETARSLFTW YMGNFDGYRS IYGAFAAVPF
251 FLLWLNLLWT LVLGGAVLTS SLSYWQGEAF RRXFDSRGRF DDVLKILLLL
301 DAAQKEGXAL PVQEFRRHIN MGYDELGELL EKLARHGYIY SGRQGWVLKT
351 GADSIELNEL FKLFVYRPLP VERDHVNQAV DAVMMPCLQT LNMTLAEFDA
401 QAKKQQQS*


ORF144a (SEQ ID NO: 624) and ORF144-1 (SEQ ID NO: 622) show 97.8% identity in 406 aa
overlap:

orf144a.pep   MTFLQRLQGLADNKICAFAWFVVRRFDEERVPQAAASMTFTTLLALVPVLTVMVAVASIF
orf144-1      MTFLQRLQGLADNKICAFAWFVVRRFDEERVPQAAASMTFTTLLALVPVLTVMVAVASIF

orf144a.pep   PVFDRWSDSFVSFVNQTIVPQGADMVFDYINAFREQANRLTAIGSVMLVVTSXMLIRTID
orf144-1      PVFDRWSDSFVSFVNQTIVPQGADMVFDYINAFREQANRLTAIGSVMLVVTSLMLIRTID

orf144a.pep   NTFNRIWRVNSQRPWMMQFLVYWALLTFGPLSLGVGISFXVGSVQDAALASGAPQWSGAL
orf144-1      NTFNRIWRVNSQRPWMMQFLVYWALLTFGPLSLGVGISFMVGSVQDAALASGAPQWSGAL

orf144a.pep   RTAATLXFMTLLLWGLYRXVPNRFVPARXAFVGALATAFCLETARSLFTWYMGNFDGYRS
orf144-1      RTAATLTFMTLLLWGLYRFVPNRFVPARQAFVGALATAFCLETARSLFTWYMGNFDGYRS

orf144a.pep   IYGAFAAVPFFLLWLNLLWTLVLGGAVLTSSLSYWQGEAFRRXFDSRGRFDDVLKILLLL
orf144-1      IYGAFAAVPFFLLWLNLLWTLVLGGAVLTSSLSYWQGEAFRRGFDSRGRFDDVLKILLLL

orf144a.pep   DAAQKEGXALPVQEFRRHINMGYDELGELLEKLARHGYIYSGRQGWVLKTGADSIELNEL
orf144-1      DAAQKEGKALPVQEFRRHINMGYDELGELLEKLARHGYIYSGRQGWVLKTGADSIELNEL

orf144a.pep   FKLFVYRPLPVERDHVNQAVDAVMMPCLQTLNMTLAEFDAQAKKQQQS 408
orf144-1      FKLFVYRPLPVERDHVNQAVDAVMTPCLQTLNMTLAEFDAQAKKRQ 406



Homology with a predicted ORF from N. gonorrhoeae 



ORF144 (SEQ ID NO: 620) shows 91.2% identity over a 136aa overlap with a predicted ORF
(ORF144ng) (SEQ ID NO: 626) from N. gonorrhoeae:



orf 144 . pep MTFLQRLQGLADNKI CAFAWFWRRFDEERVPQXAASMTFTTLLALVPVLTVMVAVAS IF 60 

Mill II II 1 1 1 Mill MINIMI I II 1 1 1 1 1 1! HI II III 1 1 1 1 1 II 

or f 1 4 4 ng MTFLQCWQGS ADNKI CAFAWFVI RRFSEERVPQAAASMTFTTLLALVPVLTVMVAVAS IF 60 

orf 144 .pep PVFDRWSDSFVSFVNQTIVPXGADMVFDYINAFREQANRLTAIGSVMLVVTSLMLIRTID 120

llllllllllllllllll II lllllhllMUillllllllllllllll II 

or f 1 4 4 ng PVFDRWS DS FVS FVNQT I VPQGADMVFD Y I DAFRDQANRLTA I GSVMLWTS LMLIRTID 120 

orf 144. pep NTFNRIWRVXXQRPWM 136 
hlllllll HIM 

orf 144ng NAFNRIWRVNTQRPWMMQFLVYWALLTFGPLSLGVGISFMVGSVQDSVLSSGAQQWADAL 180






The complete length ORF144ng nucleotide sequence [<SEQ ID 625>] (SEQ ID NO: 625) is
predicted to encode a protein having amino acid sequence [<SEQ ID 626>] (SEQ ID NO: 626):



1 MTFLQCWQGS ADNKICAFAW FVIRRFSEER VPQAAASMTF TTLLALVPVL
51 TVMVAVASIF PVFDRWSDSF VSFVNQTIVP QGADMVFDYI DAFRDQANRL
101 TAIGSVMLVV TSLMLIRTID NAFNRIWRVN TQRPWMMQFL VYWALLTFGP
151 LSLGVGISFM VGSVQDSVLS SGAQQWADAL KTAARLAFMT LLLWGLYRFV
201 PNRFVPARQA FVGALITAFC LETARFLFTW YMGNFDGYRS IYGAFAAVPF
251 FLLWLNLLWT LVLGGAVLTS SLSYWQGEAF RRGFDSRGRF DDVLKILLLL
301 DAAQKEGRTL SVQEFRRHIN MGYDELGELL EKLARYGYIY SGRQGWVLKT
351 GADSIELSEL FKLFVYRPLP VERDHVNQAV DAVMTPCLQT LNMTLAEFDA
401 QAKKQQQS*

Further work revealed the following gonococcal DNA sequence [<SEQ ID 627>] (SEQ ID NO: 627):



1 ATGACCTTTT TACAACGTTG GCAAGGTTTG GCGGACAATA AAATCTGTGC 

51 ATTTGCATGG TTCGTCATCC GCCGTTTCAG TGAAGAGCGC GTACCGCAGG 

101 CAGCGGCGAG CATGACGTTT ACGACACTGC TGGCACTCGT CCCCGTACTG

151 ACCGTAATGG TCGCGGTCGC TTCGATTTTC CCCGTGTTCG ACCGCTGGTC 

201 GGATTCGTTC GTCTCCTTCG TCAACCAAAC CATTGTGCCG CAGGGCGCGG 

251 ATATGGTGTT CGACTATATC GACGCATTCC GCGATCAGGC AAACCGGCTG 

301 ACCGCCATCG GCAGCGTGAT GCTGGTCGTA ACCTCGCTGA TGCTGATTCG 

351 GACGATAGAC AATGCGTTCA ACCGCATCTG GCGGGTTAAC ACGCAACGCC 

401 CCTGGATGAT GCAGTTCCTC GTTTATTGGG CGTTGCTGAC TTTCGGGCCT

451 TTGTCTTTGG GTGTGGGCAT TTCCTTTATG GTCGGGTCGG TTCAAGACTC

501 CGTACTCTCC TCCGGAGCGC AACAATGGGC GGACGCGTTG AAGACGGCGG 

551 CAAGGCTGGC TTTCATGACG CTTTTGCTGT GGGGGCTGTA CCGCTTCGTG 

601 CCCAACCGCT TCGTGCCCGC CCGGCAGGCG TTTGTCGGAG CTTTGATTAC 

651 GGCATTCTGC CTGGAGACGG CACGTTTCCT GTTCACCTGG TATATGGGCA 

701 ATTTCGACGG CTACCGCTCG ATTTACGGCG CATTTGCCGC CGTGCCGTTT 

751 TTCCTGCTGT GGTTAAACCT GCTGTGGACG CTGGTCTTGG GCGGGGCGGT 

801 GCTGACTTCG TCGCTGTCTT ATTGGCAGGG CGAGGCCTTC CGCAGGGGAT 

851 TCGACTCGCG CGGACGGTTT GACGACGTGT TGAAAATCCT GCTGCTTCTG 

901 GATGCGGCGC AAAAAGAAGG CCGAACCCTG TCCGTTCAGG AGTTCAGACG 

951 GCATATCAAT ATGGGTTACG ATGAATTGGG CGAGCTTTTG GAAAAGCTGG 

1001 CGCGGTACGG CTATATCTAT TCCGGCAGAC AGGGCTGGGT TTTGAAAACG 

1051 GGGGCGGATT CGATTGAGTT GAGCGAACTC TTCAAGCTCT TCGTGTACCG 

1101 CCCGTTGCct gtggaAAGGG ATCATGTGAA CCAAGCTGtc gaTGCGGTAA 

1151 TGAcgccgtG TTTGCAGACT TTGAACATGA CGCTGGCGGA GTTTGACGCT 

1201 CAGgcgAAAA AACAGCAGCA GTCTTGA

This encodes a variant of ORF144ng, having the amino acid sequence [<SEQ ID 628; ORF144ng-1>] (SEQ ID NO: 628; ORF144ng-1):



1 MTFLQRWQGL ADNKICAFAW FVIRRFSEER VPQAAASMTF TTLLALVPVL
51 TVMVAVASIF PVFDRWSDSF VSFVNQTIVP QGADMVFDYI DAFRDQANRL
101 TAIGSVMLVV TSLMLIRTID NAFNRIWRVN TQRPWMMQFL VYWALLTFGP
151 LSLGVGISFM VGSVQDSVLS SGAQQWADAL KTAARLAFMT LLLWGLYRFV
201 PNRFVPARQA FVGALITAFC LETARFLFTW YMGNFDGYRS IYGAFAAVPF
251 FLLWLNLLWT LVLGGAVLTS SLSYWQGEAF RRGFDSRGRF DDVLKILLLL
301 DAAQKEGRTL SVQEFRRHIN MGYDELGELL EKLARYGYIY SGRQGWVLKT
351 GADSIELSEL FKLFVYRPLP VERDHVNQAV DAVMTPCLQT LNMTLAEFDA
401 QAKKQQQS*

ORF144ng-1 (SEQ ID NO: 628) and ORF144-1 (SEQ ID NO: 622) show 94.1% identity in 406 aa
overlap:

orf 144ng-l .pep MTFLQRWQGLADNKICAFAWFVIRRFSEERVPQAAASMTFTTLLALVPVLTVMVAVASIF 

5 IIIMI MINI IIMIMI MIMIMIIIIIMIMII IMMMIIMIMI II 

orf 144 - 1 MTFLQRLQGLADNKICAFAWFVVRRFDEERVPQAAASMTFTTLLALVPVLTVMVAVASIF
orf 144ng-l .pep P VFDRWSDS FVS FVNQT I VPQGADMVFDY I DAFRDQANRLTAI GS VML WTSLML I RT I D 

iiiiiiiiii iiiiiiiiiiiiiiiiiiihi ihiiiiiiiiiiiiiiiiiii imii 

orf 144 - 1 PVFDRWSDS FVS FVNQT I VPQGADMVFDY INAFREQANRLTAIGSVMLWTSLMLIRTID 

orf 144ng-1 .pep NAFNRIWRVNTQRPWMMQFLVYWALLTFGPLSLGVGISFMVGSVQDSVLSSGAQQWADAL

hi I I I I II i : I I I I I I I I I I I I I I I I I I I I I M I I i I I I I I I I - hi I I Ih II 
orf 144 - 1 NTFNRIWRVNSQRPWMMQFLVYWALLTFGPLSLGVGISFMVGSVQDAALASGAPQWSGAL 

orf 144ng- 1 . pep KTAARLAFMTLLLWGLYRFVPNRFVPARQAFVGALITAFCLETARFLFTWYMGNFDGYRS 

: | I I hllllllllllllllllllllllllllll IIIIIIIM llllllllllllll 
orf 144 - 1 RTAATLTFMTLLLWGLYRFVPNRFVPARQAFVGALATAFCLETARSLFTWYMGNFDGYRS

orf 144ng- 1 . pep I YGAFAAVP FFLLWLNLLWTLVLGGAVLTS S LS YWQGEAFRRGFDSRGRFDDVLKI LLLL 
I I I I I I I I I I I I I I I I I I I I I I I I II I I I I I I I I I I I I I I I I I II I I I I I I I I I I I I I I 
orf 144 - 1 I YGAFAAVP FFLLWLNLLWTLVLGGAVLTS SLS YWQGEAFRRGFDSRGRFDDVLKI LLLL 

orf 144ng-l .pep DAAQKEGRTLS VQE FRRH INMGYDELGELLEKLARYGY I YSGRQGWVLKTGADS I ELS EL 

20 1 1 1 1 1 1 h = I 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 II 1 1 1 1 1 1 h 1 1 1 1 1 1 1 1 1 II 1 1 1 1 1 1 1 1 1 h 1 1 

orf 144-1 DAAQKEGKALPVQEFRRHINMGYDELGELLEKLARHGYIYSGRQGWVLKTGADSIELNEL 

orf 144ng- 1 . pep FKLFVYRPLPVERDHVNQAVDAVMTPCLQTLNMTLAEFDAQAKKQQQS 

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I j I I I I I I I I I I I I I h I 
or f 14 4 - 1 FKLFVYRPLPVERDHVNQAVDAVMTPCLQTLNMTLAEFDAQAKKRQ 


On the basis of this analysis, including the identification of several putative transmembrane
domains in the gonococcal protein, it is predicted that the proteins from N. meningitidis and
N. gonorrhoeae, and their epitopes, could be useful antigens for vaccines or diagnostics, or for
raising antibodies.

Example 75

The following partial DNA sequence was identified in N. meningitidis [<SEQ ID 629>] (SEQ ID NO: 629):



1 ..AGACACGCCC GCCGCATCCG CATCGACACC GCCATCAACC CCGAACTGGA

51 AGCCCTCGCC GAACACCTCC ACTACCAATG GCAGGGCTTC CTCTGGCTCA 

101 GCACCGATAT GCGTCAGGAA ATTTCCGCCC TCGTCATCCT GCTGCAACGC

151 ACCCGCCGCA AATGGCTGGA TGCCCACGAA CGCCAACACC TGCGCCAAAG 

201 CCTGCTTGAA ACACGGGAAC ACGGCTGA 

This corresponds to the amino acid sequence [<SEQ ID 630; ORF146>] (SEQ ID NO: 630; ORF146):



1 ..RHARRIRIDT AINPELEALA EHLHYQWQGF LWLSTDMRQE ISALVILLQR
51 TRRKWLDAHE RQHLRQSLLE TREHG* 
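
The correspondence between a nucleotide listing and its amino acid listing can be spot-checked by translating in frame 1 with the standard genetic code. The following is a minimal sketch only; the function name `translate` and the choice to report unreadable codons as `X` are illustrative assumptions, not part of the original analysis, and the specification does not state which software was used.

```python
# Standard genetic code, built from the conventional T, C, A, G ordering.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def translate(dna: str) -> str:
    """Translate DNA in frame 1; codons with unknown bases become 'X'."""
    seq = "".join(dna.split()).upper()  # drop the spaces between 10-mers
    usable = len(seq) - len(seq) % 3    # ignore a trailing partial codon
    return "".join(
        CODON_TABLE.get(seq[pos:pos + 3], "X") for pos in range(0, usable, 3)
    )


# First 30 nucleotides of the partial sequence (SEQ ID NO: 629) as printed.
fragment = "AGACACGCCC GCCGCATCCG CATCGACACC"
print(translate(fragment))
```

Under these assumptions the fragment translates to the first ten residues of the ORF146 listing, and the stop codon TGA at the end of the nucleotide listing corresponds to the `*` that closes the amino acid listing.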

Further work revealed the complete nucleotide sequence [<SEQ ID 631>] (SEQ ID NO: 631):



1 ATGAACACCT CGCAACGCAA CCGCCTCGTC AGCCGCTGGC TCAACTCCTA 

51 CGAACGCTAC CGCTACCGCC GCCTCATCCA CGCCGTCCGG CTCGGCGGGG 

101 CCGTCCTGTT CGCCACCGCC TCCGCCCGGC TGCTCCACCT CCAACACGGC 

151 GAGTGGATAG GGATGACCGT CTTCGTCGTC CTCGGCATGC TCCAGTTTCA 

201 AGGGGCGATT TACTCCAAGG CGGTGGAACG TATGCTCGGC ACGGTCATCG 

251 GGCTGGGCGC GGGTTTGGGC GTTTTATGGC TGAACCAGCA TTATTTCCAC 

301 GGCAACCTCC TCTTCTACCT CACCGTCGGC ACGGCAAGCG CACTGGCCGG 

351 CTGGGCGGCG GTCGGCAAAA ACGGCTACGT CCCTATGCTG GCAGGGCTGA 

401 CGATGTGTAT GCTCATCGGC GACAACGGCA GCGAATGGCT CGACAGCGGA

451 CTCATGCGCG CCATGAACGT CCTCATCGGC GCGGCCATCG CCATCGCCGC 

501 CGCCAAACTG CTGCCGCTGA AATCCACACT GATGTGGCGT TTCATGCTTG 

551 CCGACAACCT GGCCGACTGC AGCAAAATGA TTGCCGAAAT CAGCAACGGC 

601 AGGCGCATGA CCCGCGAACG CCTCGAGGAG AACATGGCGA AAATGCGCCA 

651 AATCAACGCA CGCATGGTCA AAAGCCGCAG CCATCTCGCC GCCACATCGG 

701 GCGAAAGCCG CATCAGCCCC GCCATGATGG AAGCCATGCA GCACGCCCAC 

751 CGTAAAATCG TCAACACCAC CGAGCTGCTC CTGACCACCG CCGCCAAGCT 

801 GCAATCTCCC AAACTCAACG GCAGCGAAAT CCGGCTGCTT GACCGCCACT 

851 TCACACTGCT CCAAACCGAC CTGCAACAAA CCGTCGCCCT TATCAACGGC

901 AGACACGCCC GCCGCATCCG CATCGACACC GCCATCAACC CCGAACTGGA 

951 AGCCCTCGCC GAACACCTCC ACTACCAATG GCAGGGCTTC CTCTGGCTCA 

1001 GCACCAATAT GCGTCAGGAA ATTTCCGCCC TCGTCATCCT GCTGCAACGC 

1051 ACCCGCCGCA AATGGCTGGA TGCCCACGAA CGCCAACACC TGCGCCAAAG 

1101 CCTGCTTGAA ACACGGGAAC ACGGCTGA 

This corresponds to the amino acid sequence [<SEQ ID 632; ORF146-1>] (SEQ ID NO: 632; ORF146-1):



1 MNTSQRNRLV SRWLNSYERY RYRRLIHAVR LGGAVLFATA SARLLHLQHG
51 EWIGMTVFVV LGMLQFQGAI YSKAVERMLG TVIGLGAGLG VLWLNQHYFH
101 GNLLFYLTVG TASALAGWAA VGKNGYVPML AGLTMCMLIG DNGSEWLDSG
151 LMRAMNVLIG AAIAIAAAKL LPLKSTLMWR FMLADNLADC SKMIAEISNG
201 RRMTRERLEE NMAKMRQINA RMVKSRSHLA ATSGESRISP AMMEAMQHAH
251 RKIVNTTELL LTTAAKLQSP KLNGSEIRLL DRHFTLLQTD LQQTVALING
301 RHARRIRIDT AINPELEALA EHLHYQWQGF LWLSTNMRQE ISALVILLQR
351 TRRKWLDAHE RQHLRQSLLE TREHG*

Computer analysis of this amino acid sequence gave the following results: 



Homology with a predicted ORF from N. meningitidis (strain A)

ORF146 (SEQ ID NO: 630) shows 98.6% identity over a 74aa overlap with an ORF (ORF146a)
(SEQ ID NO: 634) from strain A of N. meningitidis:

10 20 30

orf 146 .pep RHARRIRIDTAINPELEALAEHLHYQWQGF 

Illlllllllllllllllllllllllllll 
Orf 146a KLNGSEIRLLDRHFTLLQTDLQQTVALINGRHARRIRIDTAINPELEALAEHLHYQWQGF 
280 290 300 310 320 330

40 50 60 70 

orf 14 6. pep LWLSTDMRQEISALVILLQRTRRKWLDAHERQHLRQSLLETREHGX 

M I ' - ! I ! 1 1 1 1 1 1 1 M M 1 1 i 1 1 1 ' 1 1 1 1 1 1 II 1 1 1 1 i 1 1 h 

orf 14 6a LWLSTNMRQE I SALVI LLQRTRRKWLDAHERQHLRQSLLETREHSX 

340 350 360 370

The complete length ORF146a nucleotide sequence [<SEQ ID 633>] (SEQ ID NO: 633) is:

1 ATGAACACCT CGCAACGCAA CCGCCTCGTC AGCCGCTGGC TCAACTCCTA 

51 CGAACGCTAC CGCTACCGCC GCCTCATCCA CGCCGTCCGG CTCGGCGGGG 

101 CCGTCCTGTT CGCCACCGCC TCCGCCCGGC TGCTCCACCT CCAACACGGC

151 GAGTGGATAG GGATGACCGT CTTCGTCGTC CTCGGCATGC TCCAGTTTCA 

201 AGGGGCGATT TACTCCAAGG CGGTGGAACG TATGCTCGGC ACGGTCATCG 

251 GGCTGGGCGC GGGTTTGGGC GTTTTATGGC TGAACCAGCA TTATTTCCAC 

301 GGCAACCTCC TCTTCTACCT CACCGTCGGC ACGGCAAGCG CACTGGCCGG 

351 CTGGGCGGCG GTCGGCAAAA ACGGCTACGT CCCTATGCTG GCGGGGCTGA

401 CGATGTGCAT GCTCATCGGC GACAACGGCA GCGAATGGTT CGACAGCGGC

451 CTGATGCGCG CGATGAACGT CCTCATCGGC GCGGCCATCG CCATCGCCGC 

501 CGCCAAACTG CTGCCGCTGA AATCCACACT GATGTGGCGT TTCATGCTTG 

551 CCGACAACCT GACCGACTGC AGCAAAATGA TTGCCGAAAT CAGCAACGGC 

601 AGGCGCATGA CCCGCGAACG CCTCGAAGAG AACATGGCGA AAATGCGCCA

651 AATCAACGCA CGCATGGTCA AAAGCCGCAG CCACCTCGCC GCCACATCGG 

701 GCGAAAGCCG CATCAGCCCC GCCATGATGG AAGCCATGCA GCACGCCCAC 

751 CGTAAAATTG TCAACACCAC CGAGCTGCTC CTGACCACCG CCGCCAAGCT 

801 GCAATCTCCC AAACTCAACG GCAGCGAAAT CCGGCTGCTT GACCGCCACT 

851 TCACACTGCT CCAAACCGAC CTGCAACAAA CCGTCGCCCT TATCAACGGC

901 AGACACGCCC GCCGCATCCG CATCGACACC GCCATCAACC CCGAACTGGA 

951 AGCCCTCGCC GAACACCTCC ACTACCAATG GCAGGGCTTC CTCTGGCTCA 

1001 GCACCAATAT GCGTCAGGAA ATTTCCGCCC TCGTCATCCT GCTGCAACGC 

1051 ACCCGCCGCA AATGGCTGGA TGCCCACGAA CGCCAACACC TGCGCCAAAG 

1101 CCTGCTTGAA ACACGGGAAC ACAGTTGA

This encodes a protein having amino acid sequence [<SEQ ID 634>] (SEQ ID NO: 634):

1 MNTSQRNRLV SRWLNSYERY RYRRLIHAVR LGGAVLFATA SARLLHLQHG
51 EWIGMTVFVV LGMLQFQGAI YSKAVERMLG TVIGLGAGLG VLWLNQHYFH
101 GNLLFYLTVG TASALAGWAA VGKNGYVPML AGLTMCMLIG DNGSEWFDSG
151 LMRAMNVLIG AAIAIAAAKL LPLKSTLMWR FMLADNLTDC SKMIAEISNG
201 RRMTRERLEE NMAKMRQINA RMVKSRSHLA ATSGESRISP AMMEAMQHAH
251 RKIVNTTELL LTTAAKLQSP KLNGSEIRLL DRHFTLLQTD LQQTVALING
301 RHARRIRIDT AINPELEALA EHLHYQWQGF LWLSTNMRQE ISALVILLQR
351 TRRKWLDAHE RQHLRQSLLE TREHS*

ORF146a (SEQ ID NO: 634) and ORF146-1 (SEQ ID NO: 632) show 99.5% identity in 374 aa
overlap:



orf146a.pep   MNTSQRNRLVSRWLNSYERYRYRRLIHAVRLGGAVLFATASARLLHLQHGEWIGMTVFVV
orf146-1      MNTSQRNRLVSRWLNSYERYRYRRLIHAVRLGGAVLFATASARLLHLQHGEWIGMTVFVV

orf146a.pep   LGMLQFQGAIYSKAVERMLGTVIGLGAGLGVLWLNQHYFHGNLLFYLTVGTASALAGWAA
orf146-1      LGMLQFQGAIYSKAVERMLGTVIGLGAGLGVLWLNQHYFHGNLLFYLTVGTASALAGWAA

orf146a.pep   VGKNGYVPMLAGLTMCMLIGDNGSEWFDSGLMRAMNVLIGAAIAIAAAKLLPLKSTLMWR
orf146-1      VGKNGYVPMLAGLTMCMLIGDNGSEWLDSGLMRAMNVLIGAAIAIAAAKLLPLKSTLMWR

orf146a.pep   FMLADNLTDCSKMIAEISNGRRMTRERLEENMAKMRQINARMVKSRSHLAATSGESRISP
orf146-1      FMLADNLADCSKMIAEISNGRRMTRERLEENMAKMRQINARMVKSRSHLAATSGESRISP

orf146a.pep   AMMEAMQHAHRKIVNTTELLLTTAAKLQSPKLNGSEIRLLDRHFTLLQTDLQQTVALING
orf146-1      AMMEAMQHAHRKIVNTTELLLTTAAKLQSPKLNGSEIRLLDRHFTLLQTDLQQTVALING

orf146a.pep   RHARRIRIDTAINPELEALAEHLHYQWQGFLWLSTNMRQEISALVILLQRTRRKWLDAHE
orf146-1      RHARRIRIDTAINPELEALAEHLHYQWQGFLWLSTNMRQEISALVILLQRTRRKWLDAHE

orf146a.pep   RQHLRQSLLETREHSX
orf146-1      RQHLRQSLLETREHGX
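
The identity percentages quoted in these comparisons are ratios of identical aligned positions to the length of the overlap. As a hedged illustration, assuming two alignment rows of equal length with `-` marking gaps, the calculation can be sketched as below. The function name `percent_identity` and the choice of denominator (every aligned column, gaps included) are assumptions; the alignment program behind the figures in this specification may count the overlap differently.

```python
def percent_identity(row_a: str, row_b: str, gap: str = "-") -> float:
    """Percent of aligned columns holding the same residue in both rows."""
    if len(row_a) != len(row_b):
        raise ValueError("aligned rows must have equal length")
    if not row_a:
        raise ValueError("alignment is empty")
    # A column counts as identical only if both rows carry the same residue;
    # two gap characters facing each other are not counted as a match.
    matches = sum(1 for a, b in zip(row_a, row_b) if a == b and a != gap)
    return 100.0 * matches / len(row_a)


# Last row of the ORF146a / ORF146-1 comparison, without the stop marker.
print(round(percent_identity("RQHLRQSLLETREHS", "RQHLRQSLLETREHG"), 1))
```

Applied row by row and summed over the whole overlap, the same count gives the kind of overall figure reported for each pair of ORFs.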

Homology with a predicted ORF from N. gonorrhoeae 

ORF146 (SEQ ID NO: 630) shows 97.3% identity over a 75aa overlap with a predicted ORF
(ORF146ng) (SEQ ID NO: 636) from N. gonorrhoeae:

orf 146. pep RHARRIRIDTAINPELEALAEHLHYQWQGF 30 

I I I I I I I I I Ml I I I I I I I I II I I II I I 
orf 146ng KLNGSEIRLLDRHFTLLQTDLQQTAALINGRHARRIRIDTAINPELEALAEHLHYQWQGF 364 

orf 146 .pep LWLSTDMRQEISALVILLQRTRRKWLDAHERQHLRQSLLETREHG 75

I I I I hi I I I I M I I I I I I II I I I I I I I II I II I I II I I I M I 
orfl4 6ng LWLSTNMRQE I SALV I PLQRTRRKWLDAHERQHLRQS LLETREHG 409 

An ORF146ng nucleotide sequence [<SEQ ID 635>] (SEQ ID NO: 635) was predicted to encode a
protein having amino acid sequence [<SEQ ID 636>] (SEQ ID NO: 636):



1 MSGVRFPSPA PIPSTDPPSG SLCFFTFPLQ TASDMNSSQR KRLSGRWLNS
51 YERYRHRRLI HAVRLGGTVL FATALARLLH LQHGEWIGMT VFVVLGMLQF
101 QGAIYSNAVE RMLGTVIGLG AGLGVLWLNQ HYFHGNLLFY LTIGTASALA
151 GWAAVGKNGY VPMLAGLTMC MLIGDNGSEW LDSGLMRAMN VLIGAAIAIA
201 AAKLLPLKST LMWRFMLADN LADCSKMIAE ISNGRRMTRE RLEQNMVKMR
251 QINARMVKSR SHLAATSGES RISPSMMEAM QHAHRKIVNT TELLLTTAAK
301 LQSPKLNGSE IRLLDRHFTL LQTDLQQTAA LINGRHARRI RIDTAINPEL
351 EALAEHLHYQ WQGFLWLSTN MRQEISALVI PLQRTRRKWL DAHERQHLRQ
401 SLLETREHG*


Further work revealed the following gonococcal DNA sequence [<SEQ ID 637>] (SEQ ID NO: 637):

1 ATGAACTCCT CGCAACGCAA ACGCCTTTCC GgccGCTGGC TCAACTCCTA

51 CGAACGCTac cGCCaccGCC GCCTCATACA TGCCGTGCGG CTCGGCggaa 

101 ccgtCCTGTT CGCCACCGCA CTCGCCCGgc tACTCCACCT CCAacacggc 

151 gAATGGATAG GGAtgaCCGT CTTCGTCGTC CTCGGCATGC TCCAGTTCCA 

201 AGGCgcgatt tActccaacg cggtgGAacg taTGctcggt acggtcatcg 

251 ggctgGGCGC GGGTTTGGgc gTTTTATGGC TGAACCAGCA TTAtttccac 

301 ggcaacCTcc tcttctacct gaccatcggc acggcaagcg cactggccgg 

351 ctGGGCGGCG GTCGGCAAAA acggctacgt ccctatgctg GCGGGGctgA 

401 CGATGTGCAT gctcatcggc gACAACGGCA GCGAATGGCT CGACAGCGGC

451 CTGATGCGCG CGATGAACGT CCTCATCGGC GCCGCCATCG CCATTGCCGC

501 CGCCAAACTG CTGCCGCTGA AATCCACACT GATGTGGCGT TTCATGCTTG 

551 CCGACAACCT GGCCGACTGC AGCAAAATGA TTGCCGAAAT CAGCAACGGC 

601 AGGCGTATGA CGCGCGAACG TTTGGAGCAG AATATGGTCA AAATGCGCCA 

651 AATCAACGCA CGCATGGTCA AAAGCCGCAG CCACCTCGCC GCCACATCGG 

701 GCGAAAGCCG CATCAGCCCC TCCATGATGG AAGCCATGCA GCACGCCCAC 

751 CGCAAAATCG TCAACACCAC CGAGCTGCTC CTGACCACCG CCGCCAAGCT 

801 GCAATCTCCC AAACTCAACG GCAGCGAAAT CCGGCTGCTC GACCGCCACT 

851 TCACACTGCT CCAAACCGAC CTGCAACAAA CCGCCGCCCT CATCAACGGC 

901 AGACACGCCC GCCGCATCCG CATCGACACC GCCATCAACC CCGAACTGGA 

951 AGCCCTCGCC GAACACCTCC ACTACCAATG GCAGGGCTTC CTCTGGCTCA 

1001 GCACCAATAT GCGTCAGGAA ATTTCCGCCC TCGTCATCCT GCTGCAACGC 

1051 ACCCGCCGCA AATGGCTGGA TGCCCACGAA CGCCAACACC TGCGCCAAAG 

1101 CCTGCTTGAA ACACGGGAAC ACGGCTGA 

This corresponds to the amino acid sequence [<SEQ ID 638; ORF146ng-1>] (SEQ ID NO: 638; ORF146ng-1):



1 MNSSQRKRLS GRWLNSYERY RHRRLIHAVR LGGTVLFATA LARLLHLQHG
51 EWIGMTVFVV LGMLQFQGAI YSNAVERMLG TVIGLGAGLG VLWLNQHYFH
101 GNLLFYLTIG TASALAGWAA VGKNGYVPML AGLTMCMLIG DNGSEWLDSG
151 LMRAMNVLIG AAIAIAAAKL LPLKSTLMWR FMLADNLADC SKMIAEISNG
201 RRMTRERLEQ NMVKMRQINA RMVKSRSHLA ATSGESRISP SMMEAMQHAH
251 RKIVNTTELL LTTAAKLQSP KLNGSEIRLL DRHFTLLQTD LQQTAALING
301 RHARRIRIDT AINPELEALA EHLHYQWQGF LWLSTNMRQE ISALVILLQR
351 TRRKWLDAHE RQHLRQSLLE TREHG*



ORF146ng-1 (SEQ ID NO: 638) and ORF146-1 (SEQ ID NO: 632) show 96.5% identity in 375 aa
overlap:

orf146-1.pep  MNTSQRNRLVSRWLNSYERYRYRRLIHAVRLGGAVLFATASARLLHLQHGEWIGMTVFVV
orf146ng-1    MNSSQRKRLSGRWLNSYERYRHRRLIHAVRLGGTVLFATALARLLHLQHGEWIGMTVFVV

orf146-1.pep  LGMLQFQGAIYSKAVERMLGTVIGLGAGLGVLWLNQHYFHGNLLFYLTVGTASALAGWAA
orf146ng-1    LGMLQFQGAIYSNAVERMLGTVIGLGAGLGVLWLNQHYFHGNLLFYLTIGTASALAGWAA

orf146-1.pep  VGKNGYVPMLAGLTMCMLIGDNGSEWLDSGLMRAMNVLIGAAIAIAAAKLLPLKSTLMWR
orf146ng-1    VGKNGYVPMLAGLTMCMLIGDNGSEWLDSGLMRAMNVLIGAAIAIAAAKLLPLKSTLMWR

orf146-1.pep  FMLADNLADCSKMIAEISNGRRMTRERLEENMAKMRQINARMVKSRSHLAATSGESRISP
orf146ng-1    FMLADNLADCSKMIAEISNGRRMTRERLEQNMVKMRQINARMVKSRSHLAATSGESRISP

orf146-1.pep  AMMEAMQHAHRKIVNTTELLLTTAAKLQSPKLNGSEIRLLDRHFTLLQTDLQQTVALING
orf146ng-1    SMMEAMQHAHRKIVNTTELLLTTAAKLQSPKLNGSEIRLLDRHFTLLQTDLQQTAALING

orf146-1.pep  RHARRIRIDTAINPELEALAEHLHYQWQGFLWLSTNMRQEISALVILLQRTRRKWLDAHE
orf146ng-1    RHARRIRIDTAINPELEALAEHLHYQWQGFLWLSTNMRQEISALVILLQRTRRKWLDAHE

orf146-1.pep  RQHLRQSLLETREHGX
orf146ng-1    RQHLRQSLLETREHGX

Furthermore, ORF146ng-1 (SEQ ID NO: 638) shows homology with a hypothetical E. coli protein
(SEQ ID NO: 1150):

sp|P33011|YEEA_ECOLI HYPOTHETICAL 40.0 KD PROTEIN IN COBU-SBMC INTERGENIC REGION
>gi|1736674|gnl|PID|d1016553 (D90838) ORF_ID:o348#20; similar to [SwissProt
Accession Number P33011] [Escherichia coli]
>gi|1736682|gnl|PID|d1016560 (D90839) ORF_ID:o348#20; similar to [SwissProt
Accession Number P33011] [Escherichia coli]
>gi|1788318 (AE000292) f352; 100% identical to fragment YEEA_ECOLI SW: P33011 but
has 203 additional C-terminal residues [Escherichia coli] Length = 352
Score = 109 bits (271), Expect = 2e-23
Identities = 89/347 (25%), Positives = 150/347 (42%), Gaps = 21/347 (6%)

Query: 20 YRHRRLIHAVRLGGTVLFATALARLLHLQHGEWIGMTVFVVLGMLQFQGAIYSNAVERML 79

YRH R++H R+ L + RL + W +T+ V++G + F G + A ER+ 
Sbjct: 15 YRHYRIVHGTRVALAFLLTFLIIRLFTIPESTWPLVTMWIMGPISFWGNWPRAFERIG 74 

Query: 80 GTVIGLGAGLGVLWLNQHYFHGNLLFYLTIGTASALAGWAAVGKNGYVPMLAGLTMCMLI 139
GTV+G GL L L L + A L GW A+GK Y +L G+T+ + + + 

Sbjct: 75 GTVLGS I LGLI ALQLE LISLPLMLVWCAAAMFLCGWLALGKKPYQGLLIGVTLAIW 131

Query: 140 GDNGSEWLDSGLMRAMNVLIGXXXXXXXXKXLPLKSTLMWRFMLADNLADCSKMIAEISN 199

G E +D+ L R+ +V++G + P ++ + WR LA +L + +++ + 

Sbjct: 132 GSPTGE-IDTALWRSGDVILGSLLAMLFTGIWPQRAFIHWRIQLAKSLTEYNRVYQSAFS 190 

Query: 200 GRRMTRERLEQNMVKMRQINARMVKSRSHLAATSGESRISPSMMEAMQHAHRKIVNXXXX 259 
+ R RLE ++ K+ VK R +A S E+RI S+ E +Q +R +V

Sbjct: 191 PNLLERPRLESHLQKLL TDAVKMRGL I APAS KETR I PKS I YEG I QT INRNLVCMLEL 247 

Query: 260 XXXXXXXXQSPK LNGS E I RLLDRHFXXXXXXXXXXAAL I NGRHARR I R I DT A I NP EL 316 

+ LN ++R D AL G +N + 

Sbjct: 248 Q INAYWATRPSHFVLLNAQKLR- - DTQHMMQQ I LLSLVHALYEGNPQPVFANTEKLNDAV 305 

Query: 317 EALAEHL - - HYQWQ GFLWLSTNMRQEISALVILLQRTRRK 354

E L + L H+ + G++WL+ + + L L+ R RK 

Sbjct: 306 EELRQLLNNHHDLKWETP I YGYVWLNMETAHQLELLSNL I CRALRK 352 

On the basis of this analysis, including the identification of several transmembrane domains in the
gonococcal protein, it is predicted that the proteins from N. meningitidis and N. gonorrhoeae, and
their epitopes, could be useful antigens for vaccines or diagnostics, or for raising antibodies.



Example 76 

The following partial DNA sequence was identified in N. meningitidis [<SEQ ID 639>] (SEQ ID NO: 639):



1 ..GCCGAAGACA CGCGCGTTAC CGCACAGCTT TTGAGCGCGT ACGGCATTCA

51 GGGCAAACTC GTCAGTGTGC GCGAACACAA CGAACGGCAG ATGGCGGACA 

101 AGATTGTCGG CTATCTTTCA GACGGCATGG TTGTGGCACA GGTTTCCGAT 

151 GCGGGTACGC CGGCCGTGTG CGACCCGGGC GCGAAACTCG CCCGCCGCGT 

201 GCGTGAGGCC GGGTTTAAAG TCGTTCCCGT CGTGGGCGCA AC.GCGGTGA

251 TGGCGGCTTT GAGCGTGGCC GGTGTGGAAG GATCCGATTT TTATTTCAAC 

301 GGTTTTGTAC CGCCGAAATC GGGAGAACGC AGGAAACTGT TTGCCAAATG

351 GGTGCGGGCG GCGTTTCCTA TCGTCATGTT TGAAACGCCG CACCGCATCG

401 GTGCAGCGCT TGCCGATATG GCGGAACTGT TCCCCGAACG CCGATTAATG
451 CTGGCGCGCG AAATTACGAA AACGTTTGAA ACGTTCTTAA GCGGCACGGT
501 TGGGGAAATT CAGACGGCAT TGTCTGCCGA CGGCGACCAA TCGCGCGGCG 
551 AGATGGTGTT GGTGCTTTAT CCGGCGCAGG ATGAAAAACA CGAAGGCTTG 
601 TCCGAGTCCG CGCAAAACAT CATGAAAATC CTCACAGCCG AGCTGCCGAC 
651 CAAACAGGCG GCGGAGCTTG CTGCCAAAAT CACGGGCGAG GGAAAGAAAG 
701 CTTTGTACGA T..



This corresponds to the amino acid sequence [<SEQ ID 640; ORF147>] (SEQ ID NO: 640; ORF147):



1 ..AEDTRVTAQL LSAYGIQGKL VSVREHNERQ MADKIVGYLS DGMVVAQVSD
51 AGTPAVCDPG AKLARRVREA GFKVVPVVGA XAVMAALSVA GVEGSDFYFN
101 GFVPPKSGER RKLFAKWVRA AFPIVMFETP HRIGAALADM AELFPERRLM
151 LAREITKTFE TFLSGTVGEI QTALSADGDQ SRGEMVLVLY PAQDEKHEGL
201 SESAQNIMKI LTAELPTKQA AELAAKITGE GKKALYD..



Further work revealed the complete nucleotide sequence [<SEQ ID 641>] (SEQ ID NO: 641):

1 ATGTTTCAGA AACATTTGCA GAAAGCCTCC GACAGCGTCG TCGGAGGGAC 

51 ATTATACGTG GTTGCCACGC CCATCGGCAA TTTGGCGGAC ATTACCCTGC 

101 GCGCTTTGGC GGTATTGCAA AAGGCGGACA TCATCTGTGC CGAAGACACG 

151 CGCGTTACCG CACAGCTTTT GAGCGCGTAC GGCATTCAGG GCAAACTCGT 

201 CAGTGTGCGC GAACACAACG AACGGCAGAT GGCGGACAAG ATTGTCGGCT 

251 ATCTTTCAGA CGGCATGGTT GTGGCACAGG TTTCCGATGC GGGTACGCCG 

301 GCCGTGTGCG ACCCGGGCGC GAAACTCGCC CGCCGCGTGC GTGAGGCCGG 

351 GTTTAAAGTC GTTCCCGTCG TGGGCGCAAG CGCGGTGATG GCGGCTTTGA 

401 GCGTGGCCGG TGTGGAAGGA TCCGATTTTT ATTTCAACGG TTTTGTACCG

451 CCGAAATCGG GAGAACGCAG GAAACTGTTT GCCAAATGGG TGCGGGCGGC

501 GTTTCCTATC GTCATGTTTG AAACGCCGCA CCGCATCGGT GCGACGCTTG 

551 CCGATATGGC GGAACTGTTC CCCGAACGCC GATTAATGCT GGCGCGCGAA 

601 ATTACGAAAA CGTTTGAAAC GTTCTTAAGC GGCACGGTTG GGGAAATTCA 

651 GACGGCATTG TCTGCCGACG GCAACCAATC GCGCGGCGAG ATGGTGTTGG 

701 TGCTTTATCC GGCGCAGGAT GAAAAACACG AAGGCTTGTC CGAGTCCGCG 

751 CAAAACATCA TGAAAATCCT CACAGCCGAG CTGCCGACCA AACAGGCGGC 

801 GGAGCTTGCT GCCAAAATCA CGGGCGAGGG AAAGAAAGCT TTGTACGATC 

851 TGGCTCTGTC TTGGAAAAAC AAATAG

This corresponds to the amino acid sequence [<SEQ ID 642; ORF147-1>] (SEQ ID NO: 642; ORF147-1):



1 MFQKHLQKAS DSVVGGTLYV VATPIGNLAD ITLRALAVLQ KADIICAEDT
51 RVTAQLLSAY GIQGKLVSVR EHNERQMADK IVGYLSDGMV VAQVSDAGTP
101 AVCDPGAKLA RRVREAGFKV VPVVGASAVM AALSVAGVEG SDFYFNGFVP
151 PKSGERRKLF AKWVRAAFPI VMFETPHRIG ATLADMAELF PERRLMLARE
201 ITKTFETFLS GTVGEIQTAL SADGNQSRGE MVLVLYPAQD EKHEGLSESA
251 QNIMKILTAE LPTKQAAELA AKITGEGKKA LYDLALSWKN K*

Computer analysis of this amino acid sequence gave the following results: 

Homology with hypothetical protein ORF286 (SEQ ID NO: 1151) of E. coli (accession number
U18997)

ORF147 (SEQ ID NO: 640) and E. coli ORF286 protein (SEQ ID NO: 1151) show 36% aa identity
in 237aa overlap:

Orf147:   1 AEDTRVTAQLLSAYGIQGKLVSVREHNERQMADKIVGYLSDGMVVAQVSDAGTPAVCDPG 60
            AEDTR T LL +GI +L ++ +HNE+Q A+ ++ L +G +A VSDAGTP + DPG

            L R RE F + GF+P KS RR

            ++ +E+ HR+ +L D+ + E R ++LARE+TKT+ET VGE+ + D +

            + +GEMVL++ + E L A + +L AELP K+AA LAA+I G K ALY
Orf286: 223 RRKGEMVLIV-EGHKAQEEDLPADALRTLALLQAELPLKKAAALAAEIHGVKKNALY 278

Homology with a predicted ORF from N. meningitidis (strain A)

ORF147 (SEQ ID NO: 640) shows 96.6% identity over a 237aa overlap with ORF75a (SEQ ID
NO: 290) from strain A of N. meningitidis:

10 20 30 

or f 14 7 . pep AEDTRVTAQLLSAYGIQGKLVSVREHNERQ 

IIIIIIIIIIIIIIIIMIIIIIIIIIIII 
30 orf 75a TL YWATP I GNLAD I TLRALAVLQKAD 1 1 CAEDTRVTAQLLS AYGIQGKL VS VREHNERQ 

20 30 40 50 60 70 

40 50 60 70 80 90 

or f 14 7 . pep MADKIVGYLSDGMWAOVSDAGTPAVCDPGAKLARRVREAGF KWPWGAXAVMAALSVA 

I I 1 I I I I I I I I I I 1 1 > I i 11 1 I ! I I M I I I : I I I M ! I 1 I I lllllllll 

35 orf 75a MADKIVGYLSDGMVVAOVSDAGTPAVCDPGAKIjARRWEVGFK VVPWGASAVMAALSVA 

80 90 100 110 120 130 

100 110 120 130 140 150 

orf 14 7 . pep GVEGSDFYFNGFVPPKSGERRKLFAKWVRAAFPIVMFETPHRIGAAI^^ELFPERRLM 

II 1 1 M 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 hi I h 1 1 1 1 M I MM: 1 1 II I! 1 1 M 1 1 M 

40 orf 75a GVAGSDFYFNGFVPPKSGERRKIjFAKOTRVAFPVVMFETPHRIGATLADMAELFPERRLM 





[Row labels displaced from the ORF147 / E. coli ORF286 alignment above: Orf147 rows begin at residues 1, 61, 121 and 180; Orf286 rows begin at residues 43, 103, 163 and 223.]

140 150 160 170 180 190

160 170 180 190 200 210 

orf 147 . pep LAREITKTFETFLSGTVGEIQTALSADGDQSRGEMVLVLYPAQDEKHEGLSESAQNIMKI 

II i I I Ml i I ! I I M II I I I I I I h I I I : II I I 1 I I I I I I I I I I I M I I i i I I I M M 
5 orf 75a LAREITKTFETFLSGTVGEIQTALAADGNQSRGEMVLVLYPAQDEKHEGLSESAQNIMKI 

200 210 220 230 240 250 

220 230 
orf 14 7 .pep LTAELPTKQAAELAAKITGEGKKALYD 

M 1 1 1 1 1 1 M 1 1 1 II 1 1 1 1 1 1 1 1 1 1 1 

10 or f 7 5 a LTAELPTKQAAELAAKI TGEGKKALYDLALS WKNKX 

260 270 280 290 

ORF147a is identical to ORF75a (SEQ ID NO: 290), which includes aa 56-292 of ORF75 (SEQ ID
NO: 286).

Homology with a predicted ORF from N. gonorrhoeae

ORF147 (SEQ ID NO: 640) shows 94.1% identity over a 237aa overlap with a predicted ORF
(ORF147ng) (SEQ ID NO: 644) from N. gonorrhoeae:

orf 14 7 .pep AEDTRVTAQLLS AYG I QGKLVS VREHNERQ 30 

H I II I I I I I I I I I I I M I I 1 I I I I I I 
20 orfl47ng TLYWAT P I GNLAD I TLRALAVLQKAD 1 1 CAEDTRVTAQLLS AYG I QGRLVS VREHNERQ 8 5 

or f 14 7 . pep MADKI VGYLSDGMWAQVSDAGTPAVCDPGAKLARRVREAGFKVVPWGAXAVMAALSVA 9 0 

llh: : llhlllllllllMllliri IIIIIIIIIIIIIIMM lllllllll 
orf 14 7ng MADKVIGFLSDGLWAQVSDAGTPAVCDPGAKLARRVREAGFKVVPVVGASAVMAALSVA 14 5 

orf 147 .pep GVEGSDFYFNGFVPPKSGERRKLFAKWVRAAFPIVMFETPHRIGAALADMAELFPERRLM 150 

25 || | | | | | | | | | | | | | | | | | | | | | | | | | | | | | : || | | | | | | | | | : | | | | I I I I I I I I I I 

orf 14 7ng GVAESDFYFNGFVPPKSGERRKLFAKWVRAAFPVVMFETPHRIGATLADMAELFPERRLM 205 

orf 14 7. pep LAREITKTFETFLSGTVGEIQTALSADGDQSRGEMVLVLYPAQDEKHEGLSESAQNIMKI 210 

NIMH IMIIIIIIIIIIIMIMI IIIIIIIIIIMIIIIMII III III 

orf 14 7ng LAREITKTFETFLSGTVGEIQTALAADGNQSRGEMVLVLYPAQDEKHEGLSESAQNAMKI 265 

30 orf 147. pep LTAELPTKQAAELAAKITGEGKKALYD ' 237 

hlllllllllMIIIIMIIIIIIII 
orfl4 7ng LAAELPTKQAAELAAKITGEGKKALYDLALSWKNK 300 
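
As a rough aid to reading the percentages quoted for the two Neisseria comparisons, the sketch below lists which identity counts would be consistent with a stated percentage over a stated overlap. It assumes, which the text does not state, that each percentage is simply identical positions divided by overlap length, rounded to the precision shown. The helper name `consistent_identity_counts` is hypothetical.

```python
# Illustrative sketch only: which identity counts are consistent with a
# reported percentage? `consistent_identity_counts` is a hypothetical name,
# and the rounding convention is an assumption, not taken from the text.

def consistent_identity_counts(percent: float, overlap: int, decimals: int = 1) -> list[int]:
    """List counts n for which 100*n/overlap, rounded to `decimals`, equals `percent`."""
    return [
        n for n in range(overlap + 1)
        if abs(round(100 * n / overlap, decimals) - percent) < 1e-9
    ]


print(consistent_identity_counts(96.6, 237))  # [229]
print(consistent_identity_counts(94.1, 237))  # [223]
```

Under that assumption, 96.6% and 94.1% over 237 residues correspond to 229 and 223 identical positions respectively; a whole-number figure such as 36% is compatible with more than one count.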

An ORF147ng nucleotide sequence [<SEQ ID 643>] (SEQ ID NO: 643) was predicted to encode a
protein having amino acid sequence [<SEQ ID 644>] (SEQ ID NO: 644):

1 MSVFQTAFFM FQKHLQKASD SVVGGTLYVV ATPIGNLADI TLRALAVLQK

51 ADIICAEDTR VTAQLLSAYG IQGRLVSVRE HNERQMADKV IGFLSDGLVV

101 AQVSDAGTPA VCDPGAKLAR RVREAGFKVV PVVGASAVMA ALSVAGVAES

151 DFYFNGFVPP KSGERRKLFA KWVRAAFPVV MFETPHRIGA TLADMAELFP

201 ERRLMLAREI TKTFETFLSG TVGEIQTALA ADGNQSRGEM VLVLYPAQDE

251 KHEGLSESAQ NAMKILAAEL PTKQAAELAA KITGEGKKAL YDLALSWKNK

301 *



CHIR-0160 (356.001) 



-463- 



PATENT 



Further work revealed the following gonococcal DNA sequence [<SEQ ID 645>] (SEQ ID NO:
645):

1 ATGTTTCAGA AACACTTGCA GAAAGCCTCC GACAGCGTCG TCGGAGGGAC 

51 ATTATACGTG GTTGCCACGC CCATCGGCAA TTTGGCAGAC ATTACCCTGC 

101 GCGCTTTGGC GGTATTGCAA AAGGCGGACA TCATTTGTGC CGAAGACACG 

151 CGCGTTACTG CGCAGCTTTT GAGCGCGTAC GGCATTCAGG GCAGGTTGGT 

201 CAGTGTGCGC GAACACAACG AGCGGCAGAT GGCGGACAAG GTAATCGGTT 

251 TCCTTTCAGA CGGCCTGGTT GTGGCGCAGG TTTCCGATGC GGGTACGCCG 

301 GCCGTGTGCG ACCCGGGCGC GAAACTCGCC CGCCGCGTGC GCGAAGCAGG 

351 GTTCAAAGTC GTTCCCGTCG TGGGCGCAAG CGCGGTAATG GCGGCGTTGA

401 GTGTGGCCGG TGTGGCGGAA TCCGATTTTT ATTTCAACGG TTTTGTACCG 

451 CCGAAATCGG GCGAACGTAG GAAATTGTTT GCCAAATGGG TGCGGGCGGC 

501 ATTTCCTGTC GTCATGTTTG AAACGCCGCA CCGAATCGGG GCAACGCTTG 

551 CCGATATGGC GGAATTGTTC CCCGAACGCC GTCTGATGCT GGCGCGCGAA 

601 ATCACGAAAA CGTTTGAAAC GTTCTTAAGC GGCACGGTTG GGGAAATTCA 

651 GACGGCATTG GCGGCGGACG GCAACCAATC GCGCGGCGAG ATGGTGTTGG 

701 TGCTTTATCC GGCGCAGGAT GAAAAACACG AAGGCTTGTC CGAGTCTGCG 

751 CAAAATGCGA TGAAAATCCT TGCGGCCGAG CTGCCGACCA AGCAGGCGGC 

801 GGAGCTTGCC GCCAAGATTA CAGGTGAGGG CAAAAAGGCT TTGTACGATT 

851 TGGCACTGTC GTGGAAAAAC AAATGA 

This corresponds to the amino acid sequence [<SEQ ID 646; ORF147ng-1>] (SEQ ID NO: 646;
ORF147ng-1):



1 MFQKHLQKAS DSVVGGTLYV VATPIGNLAD ITLRALAVLQ KADIICAEDT

51 RVTAQLLSAY GIQGRLVSVR EHNERQMADK VIGFLSDGLV VAQVSDAGTP

101 AVCDPGAKLA RRVREAGFKV VPVVGASAVM AALSVAGVAE SDFYFNGFVP

151 PKSGERRKLF AKWVRAAFPV VMFETPHRIG ATLADMAELF PERRLMLARE

201 ITKTFETFLS GTVGEIQTAL AADGNQSRGE MVLVLYPAQD EKHEGLSESA

251 QNAMKILAAE LPTKQAAELA AKITGEGKKA LYDLALSWKN K*

ORF147ng-1 (SEQ ID NO: 646) shows homology to a hypothetical E. coli protein (SEQ ID NO:
1152):



sp|P45528|YRAL_ECOLI HYPOTHETICAL 31.3 KD PROTEIN IN AGAI-MTR INTERGENIC REGION
(F286)

>gi|606086 (U18997) ORF_f286 [Escherichia coli]

>gi|1789535 (AE000395) hypothetical 31.3 kD protein in agai-mtr intergenic region
[Escherichia coli] Length = 286
Score = 218 bits (550), Expect = 3e-56

Identities = 128/284 (45%), Positives = 171/284 (60%), Gaps = 4/284 (1%)

Query: 4   KHLQKASDSVVGGTLYVVATPIGNLADITLRALAVLQKADIICAEDTRVTAQLLSAYGIQ 63
           K  Q A +S   G LY+V TPIGNLADIT RAL VLQ  D+I AEDTR T  LL  +GI
Sbjct: 2   KQHQSADNSQ--GQLYIVPTPIGNLADITQRALEVLQAVDLIAAEDTRHTGLLLQHFGIN 59

Query: 64  GRLVSVREHNERQMADKVIGFLSDGLVVAQVSDAGTPAVCDPGAKLARRVREAGFKVVPV 123
            RL ++ +HNE+Q A+ ++  L +G  +A VSDAGTP + DPG  L R  REAG +VVP+
Sbjct: 60  ARLFALHDHNEQQKAETLLAKLQEGQNIALVSDAGTPLINDPGYHLVRTCREAGIRVVPL 119

Query: 124 VGASAVMAALSVAGVAESDFYFNGFVPPKSGERRKLFAKWVRAAFPVVMFETPHRIGATL 183
            G  A + ALS AG+    F + GF+P KS  RR            ++ +E+ HR+  +L
Sbjct: 120 PGPCAAITALSAAGLPSDRFCYEGFLPAKSKGRRDALKAIEAEPRTLIFYESTHRLLDSL 179

Query: 184 ADMAELFPERR-LMLAREITKTFETFLSGTVGEIQTALAADGNQSRGEMVLVLYPAQDEK 242
            D+  +  E R ++LARE+TKT+ET     VGE+   +  D N+ +GEMVL++      +
Sbjct: 180 EDIVAVLGESRYVVLARELTKTWETIHGAPVGELLAWVKEDENRRKGEMVLIV-EGHKAQ 238

Query: 243 HEGLSESAQNAMKILAAELPTKQAAELAAKITGEGKKALYDLAL 286
            E L   A   + +L AELP K+AA LAA+I G  K ALY  AL
Sbjct: 239 EEDLPADALRTLALLQAELPLKKAAALAAEIHGVKKNALYKYAL 282
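
The summary figures of the alignment above (Identities = 128/284, Positives = 171/284, Gaps = 4/284) can be related to the rows themselves with a short sketch. This is an illustration under stated assumptions, not the program that produced the output: the helper names `alignment_stats` and `percent` are hypothetical, and whole-number rounding is assumed because it reproduces the printed percentages.

```python
# Illustrative sketch only: recompute alignment summary figures.
# `alignment_stats` and `percent` are hypothetical helper names.

def alignment_stats(query_row: str, sbjct_row: str) -> tuple[int, int, int]:
    """Return (identities, gap columns, total columns) for two aligned rows."""
    if len(query_row) != len(sbjct_row):
        raise ValueError("aligned rows must have the same length")
    pairs = list(zip(query_row, sbjct_row))
    identities = sum(q == s and q != "-" for q, s in pairs)
    gaps = sum(q == "-" or s == "-" for q, s in pairs)
    return identities, gaps, len(pairs)


def percent(count: int, total: int) -> int:
    """Whole-number percentage (rounding convention assumed for this sketch)."""
    return round(100 * count / total)


# Last row of the alignment above (Query 243-286, Sbjct 239-282).
query = "HEGLSESAQNAMKILAAELPTKQAAELAAKITGEGKKALYDLAL"
sbjct = "EEDLPADALRTLALLQAELPLKKAAALAAEIHGVKKNALYKYAL"
print(alignment_stats(query, sbjct))  # (22, 0, 44)
print(percent(128, 284), percent(171, 284), percent(4, 284))  # 45 60 1
```

Summing the per-row identity counts over all five rows is one way to check a transcribed alignment against its reported total.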

Based on the computer analysis and the presence of a putative transmembrane domain in the 
gonococcal protein, it is predicted that these proteins from N. meningitidis and N. gonorrhoeae, and 
their epitopes, could be useful antigens for vaccines or diagnostics, or for raising antibodies. 

Example 77 

The following partial DNA sequence was identified in N. meningitidis [<SEQ ID 647>] (SEQ ID
NO: 647):

1 ATGAAAACAA CCGACAAACG GACAACCGAA ACACACCGCA AAGCCCCGAA 

51 AACCGGTCGC ATCCGCTTCT C.GCTGCTTA CTTAGCCATA TGCCTGTCGT 

101 TCGGCATTCT TCCCCAAGCC TGGGCGGGAC ACACTTATTT CGGCATCAAC 

151 TACCAATACT ATCGCGACTT TGCCGAAAAT AAAGGCAAGT TTGCAGTCGG 

201 GGCGAAAGAT ATTGAGGTTT ACAACAAAAA AGGGGAGTTG GTCGGCAAAT 

251 CAATGACAAA AGCCCCGATG ATTGATTTTT CTGTGGTGTC GCGTAACGGC 

301 GTGGCGGcAT TGGTGGGCGt ATCAATATAT TGTGAGCGTG GCACATAACG 

3 51 GCGGCTATAA CAACGTTGAT TTTGGTGCGG AAGGAAk . AA tATCCC . GAT 

4 01 CAACAwCGww TTACTTATAA AATTGTGAAA CGGAATAATT ATAAAGCAGG 
4 51 GACTAAAGGC CATCCTTATG GCGGCGATTA TCATATGCCG CGTTTGCATA 
501 AATwTGTCAC AGATGCAGAA CCTGTTGAAA TGACCAGTTA TATGGATGGG 
551 CGGAAATATA TCGATCAAAA TAATTACCCT GACCGTGTTC GTATTGGGGC 
601 AGGCAGGCAA TATTGGCGAT CTGATGAAGA TGAGCCCAAT AACCGCGAAA 

651 GTTCATATCA TATTGCAAGT 

701 GGCTC ACCAATGTTT ATCTATGATG CCCAAAAGCA 

751 AAAGTGGTTA ATTAATGGGG TATTGCAAAC GGGCAACCCC TATATAGGAA 

801 AAAGCAATGG CTTCCAGCTG GTTCGTAAAG ATTGGTTCTA TGATGAAATC 

851 TTTGCTGGAG ATACCCATTC AGTATTCTAC GAACCACGTC AAAATGGGAA 

901 ATACTCTTTT AACGACGATA ATAATGGCAC AGGAAAAATC AATGCCAAAC 

951 ATGAACACAA TTCTCTGCCT AATAGATTAA AAACACGAAC CGTTCAATTG 

1001 TTTAATGTTT CTTTATCCGA GACAGCAAGA GAACCTGTTT ATCATGCTGC 

1051 AGGTGGTGTC AACAGTTATC GACCCAGACT GAATAATGGA GAAAATATTT 

1101 CCTTTATTGA CGAAGGAAAA GGCGAATTGA TACTTACCAG CAACATCAAT 

1151 CAAGGTGCTG GAGGATTATA TTTCCAAGGA GATTTTACGG TCTCGCCTGA 

1201 AAATAACGAA ACTTGGCAAG GCGCGGGCGT TCATATCAGT GAAGACAGTA 

1251 CCGTTACTTG GAAAGTAAAC GGCGTGGCAA ACGACCGCCT GTCCAAAATC 

13 01 GGCAAAGGCA CGCTG 

// 

2101 G AT AAAG 

2151 TGACTGCTTC ATTGACTAAG ACCGACATCA GCGGCAATGT CGATCTTGCC 

2201 GATCACGCTC ATTTAAATCT CACAGGGCTT GCCACACTCA ACGGCAATCT 

2251 TAGTGCAAAT GGCGATACAC GTTATACAGT CAGCCACAAC GCCACCCAAA 

23 01 ACGGCAACCk TAgCCtCGtG G.sAATGcCC AAGCAACATT TAATCAAGCC 
2351 ACATTAAACG GCAACACATC GGCTTCgGGC AATGCTTCAT TTAATCTAAG 

2401 CGACCACGCC GTACAAAACG GCAGTCTGAC GCTTTCCGGC AACGCTAAGG
2451 CAAACGTAAG CCATTCCGCA CTCAACGGTA ATGTCTCCCT AGCCGATAAG
2501 GCAGTATTCC ATTTTGAAAG CAGCCGCTTT ACCGGACAAA TCAGCGGCGG
2551 CAagGATACG GCATTACACT TAAAAGACAG CGAATGGACG CTGCCGTCAg

2601 GarCGGAATT AGGCAATTTA AACCTTGACA ACGCCACCAT TACaCTCAAT 

2651 TCCGCCTATC GCCACGATGC GGCAGGGGCG CAAACCGGCA GTGCGACAGA 

2701 TGCGCCGCGC CGCCGTTCGC GCCGTTCGCG CCGTTCCCTA TTATmCGTTA 

2751 CACCGCCAAC TTCGGTAGAA TCCCGTTTCA ACACGCTGAC GGTAAACGGC 

2801 AAATTGAACG GTCAGGGAAC ATTCCGCTTT ATGTCGGAAC TCTTCGGCTA 

2851 CCGCAGCGAC AAATTGAAGC TGGCGGAAAG TTCCGAAGGC ACTTACACCT 

2 901 TGGCGGTCAA CAATACCGGC AACGAACCTG CAAGCCTCGA ACAATTGACG 

2 951 GTAGTGGAAG GAAAAGACAA CAAACCGCTG TCCGAAAACC TTAATTTCAC 
3001 CCTGCAAAAC GAACACGTCG ATGCAGGCGC GTGG 

// 

3551 TTAGAC CGCGTATTTG CCGAAGACCG 

3601 CCGCAACGCC GTTTGGACAA GCGGCATCCG GGACACCAAA CACTACCGTT 

3651 CGCAAGATTT CCGCGCCTAC CGCCAACAAA CCGACCTGCG CCAAATCGGT 

3 701 ATGCAGAAAA ACCTCGGCAG CGGGCGCGTC GGCATCCTGT TTTCGCACAA 
3751 CCGGACCGAA AACACCTTCG ACGACGGCAT CGGCAACTCG GCACGGCTTG 
3801 CCCACGGCGC CGTTTTCGGG CAATACGGCA TCGACAGGTT CTACATCGGC 
3851 ATCAGgCGCG GGCGCGGGTT TTAGCAGCGG CAGCCTTTcA GACGGCATCG 
3 901 GAGsmAAAwT CCGCCGCCGC GTGCtGCATT ACGGCATTCA GGCACGAtAC 

3 951 CGCGCCGgtt tCggCGgATt CGGCATCGAA CCGCACATCG GCGCAACGCg 

4 001 ctATTTCGTC CAAAAAGCGG ATTACCGCTA CGAAAACGTC AATATCGCCA 
4 051 CCCCCGGCCT TGCATTCAAC CGcTACCGCG CGGGCATTAa GGCAGATTAT 
4101 TCATTCAAAC CGGCGCAACA CATTTCCATC ACGCCTTATT TGAGCCTGTC 
4151 CTATACCGAT GCCGCTTCGG GCAAAGTCCG AACACGCGTC AATACCGCCG 

42 01 TATTGGCTCA GGATTTCGGC AAAACCCGCA GTGCGGAATG GGgCGTAAAC 
4251 GCCGAAATCA AAGGTTTCAC GCTGTCCCTC CACGCTGCCG CCGCCAAAGG 

43 01 CCCGCAACTG GAAGCGCAAC ACAGCGCGGG CATCAAATTA GGCTACCGCT 
4351 GGTAA. . . 

This corresponds to the amino acid sequence [<SEQ ID 648; ORF1>] (SEQ ID NO: 648; ORF1):



1 MKTTDKRTTE THRKAPKTGR IRFXAAYLAI CLSFGILPQA WAGHTYFGIN

51 YQYYRDFAEN KGKFAVGAKD IEVYNKKGEL VGKSMTKAPM IDFSVVSRNG

101 VAALVGVQYI VSVAHNGGYN NVDFGAEGXN IXDQXRXTYK IVKRNNYKAG

151 TKGHPYGGDY HMPRLHKXVT DAEPVEMTSY MDGRKYIDQN NYPDRVRIGA

201 GRQYWRSDED EPNNRESSYH IAS GS PMFIYDAQKQ

251 KWLINGVLQT GNPYIGKSNG FQLVRKDWFY DEIFAGDTHS VFYEPRQNGK

301 YSFNDDNNGT GKINAKHEHN SLPNRLKTRT VQLFNVSLSE TAREPVYHAA

351 GGVNSYRPRL NNGENISFID EGKGELILTS NINQGAGGLY FQGDFTVSPE

401 NNETWQGAGV HISEDSTVTW KVNGVANDRL SKIGKGTL

// 

701 DKVTAS LTKTDISGNV DLADHAHLNL TGLATLNGNL 

751 SANGDTRYTV SHNATQNGNX SLVXNAQATF NQATLNGNTS ASGNASFNLS

801 DHAVQNGSLT LSGNAKANVS HSALNGNVSL ADKAVFHFES SRFTGQISGG 

851 KDTALHLKDS EWTLPSGXEL GNLNLDNATI TLNSAYRHDA AGAQTGSATD 

901 APRRRSRRSR RSLLXVTPPT SVESRFNTLT VNGKLNGQGT FRFMSELFGY 

951 RSDKLKLAES SEGTYTLAVN NTGNEPASLE QLTVVEGKDN KPLSENLNFT

1001 LQNEHVDAGA W

// 

1151 LDRVFAEDR 

1201 RNAVWTSGIR DTKHYRSQDF RAYRQQTDLR QIGMQKNLGS GRVGILFSHN 

1251 RTENTFDDGI GNSARLAHGA VFGQYGIDRF YIGISAGAGF SSGSLSDGIG 

1301 XKXRRRVLHY GIQARYRAGF GGFGIEPHIG ATRYFVQKAD YRYENVNIAT 

1351 PGLAFNRYRA GIKADYSFKP AQHISITPYL SLSYTDAASG KVRTRVNTAV 

1401 LAQDFGKTRS AEWGVNAEIK GFTLSLHAAA AKGPQLEAQH SAGIKLGYRW 

1451 * 
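
The partial nucleotide listing above contains undetermined positions, written as "." or as lowercase and ambiguity letters such as k, w, s and m, and the corresponding residues in the translation appear as X. A minimal sketch of that convention follows. It assumes the simplest rule, that any codon containing a symbol other than A, C, G or T is reported as X; the helper name `translate_with_unknowns` is hypothetical, and the source does not say which program or rule was actually used.

```python
# Illustrative sketch only: translate a reading frame, writing 'X' for any
# codon that contains an undetermined base. The rule is an assumption made
# for this sketch; `translate_with_unknowns` is a hypothetical helper name.
from itertools import product

BASES = "TCAG"
# Amino acids for all 64 codons in TCAG order (TTT, TTC, TTA, TTG, TCT, ...).
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def translate_with_unknowns(dna: str) -> str:
    """Translate frame 1; codons with non-ACGT symbols become 'X'."""
    seq = "".join(dna.split()).upper()
    usable = len(seq) - len(seq) % 3  # ignore a trailing partial codon
    return "".join(CODON_TABLE.get(seq[i:i + 3], "X") for i in range(0, usable, 3))


# Nucleotides 61-78 of the partial sequence (codons for residues 21-26).
print(translate_with_unknowns("ATCCGCTTCTC.GCTGCT"))  # IRFXAA
```

The output agrees with residues 21-26 (IRFXAA) of the partial amino acid sequence. A stricter tool could resolve some of these codons, since every TCN codon encodes serine, but the listing reports X at that position.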

Further sequencing analysis revealed the complete nucleotide sequence [<SEQ ID 649>] (SEQ ID NO: 649):

1 ATGAAAACAA CCGACAAACG GACAACCGAA ACACACCGCA AAGCCCCGAA

51 AACCGGCCGC ATCCGCTTCT CGCCTGCTTA CTTAGCCATA TGCCTGTCGT 

101 TCGGCATTCT TCCCCAAGCC TGGGCGGGAC ACACTTATTT CGGCATCAAC 

151 TACCAATACT ATCGCGACTT TGCCGAAAAT AAAGGCAAGT TTGCAGTCGG 

5 201 GGCGAAAGAT ATTGAGGTTT ACAACAAAAA AGGGGAGTTG GTCGGCAAAT 

251 CAATGACAAA AGCCCCGATG ATTGATTTTT CTGTGGTGTC GCGTAACGGC 

301 GTGGCGGCAT TGGTGGGCGA TCAATATATT GTGAGCGTGG CACATAACGG 

351 CGGCTATAAC AACGTTGATT TTGGTGCGGA AGGAAGAAAT CCCGATCAAC 

4 01 ATCGTTTTAC TTATAAAATT GTGAAACGGA ATAATTATAA AGCAGGGACT 

10 4 51 AAAGGCCATC CTTATGGCGG CGATTATCAT ATGCCGCGTT TGCATAAATT 

501 TGTCACAGAT GCAGAACCTG TTGAAATGAC CAGTTATATG GATGGGCGGA 

551 AATATATCGA TCAAAATAAT TACCCTGACC GTGTTCGTAT TGGGGCAGGC 

601 AGGCAATATT GGCGATCTGA TGAAGATGAG CCCAATAACC GCGAAAGTTC 

651 ATATCATATT GCAAGTGCGT ATTCTTGGCT CGTTGGTGGC AATACCTTTG 

15 701 CACAAAATGG ATCAGGTGGT GGCACAGTCA ACTTAGGTAG TGAAAAAATT 

751 AAACATAGCC CATATGGTTT TTTACCAACA GGAGGCTCAT TTGGCGACAG 

801 TGGCTCACCA ATGTTTATCT ATGATGCCCA AAAGCAAAAG TGGTTAATTA 

851 ATGGGGTATT GCAAACGGGC AACCCCTATA TAGGAAAAAG CAATGGCTTC 

901 CAGCTGGTTC GTAAAGATTG GTTCTATGAT GAAATCTTTG CTGGAGATAC 

20 951 CCATTCAGTA TTCTACGAAC CACGTCAAAA TGGGAAATAC TCTTTTAACG 

1001 ACGATAATAA TGGCACAGGA AAAATCAATG CCAAACATGA ACACAATTCT 

1051 CTGCCTAATA GATTAAAAAC ACGAACCGTT CAATTGTTTA ATGTTTCTTT 

1101 ATCCGAGACA GCAAGAGAAC CTGTTTATCA TGCTGCAGGT GGTGTCAACA 

1151 GTTATCGACC CAGACTGAAT AATGGAGAAA ATATTTCCTT TATTGACGAA 

25 12 01 GGAAAAGGCG AATTGATACT TACGAGCAAC ATCAATCAAG GTGCTGGAGG 

1251 ATTATATTTC CAAGGAGATT TTACGGTCTC GCCTGAAAAT AACGAAACTT 

13 01 GGCAAGGCGC GGGCGTTCAT ATCAGTGAAG ACAGTACCGT TACTTGGAAA 
1351 GTAAACGGCG TGGCAAACGA CCGCCTGTCC AAAATCGGCA AAGGCACGCT 

14 01 GCACGTTCAA GCCAAAGGGG AAAACCAAGG CTCGATCAGC GTGGGCGACG 
30 14 51 GTACAGTCAT TTTGGATCAG CAGGCAGACG ATAAAGGCAA AAAACAAGCC 

1501 TTTAGTGAAA TCGGCTTGGT CAGCGGCAGG GGTACGGTGC AACTGAATGC 

1551 CGATAATCAG TTCAACCCCG ACAAACTCTA TTTCGGCTTT CGCGGCGGAC 

1601 GTTTGGATTT AAACGGGCAT TCGCTTTCGT TCCACCGTAT TCAAAATACC 

1651 GATGAAGGGG CGATGATTGT CAACCACAAT CAAGACAAAG AATCCACCGT 

35 1701 TACCATTACA GGCAATAAAG ATATTGCTAC AACCGGCAAT AACAACAGCT 

1751 TGGATAGCAA AAAAGAAATT GCCTACAACG GTTGGTTTGG CGAGAAAGAT 

1801 ACGACCAAAA CGAACGGGCG GCTCAACCTT GTTTACCAGC CCGCCGCAGA 

1851 AGACCGCACC CTGCTGCTTT CCGGCGGAAC AAATTTAAAC GGCAACATCA 

1901 CGCAAACAAA CGGCAAACTG TTTTTCAGCG GCAGACCAAC ACCGCACGCC 

40 1951 TACAATCATT TAAACGACCA TTGGTCGCAA AAAGAGGGCA TTCCTCGCGG 

2001 GGAAATCGTG TGGGACAACG ACTGGATCAA CCGCACATTT AAAGCGGAAA 

2051 ACTTCCAAAT TAAAGGCGGA CAGGCGGTGG TTTCCCGCAA TGTTGCCAAA 

2101 GTGAAAGGCG ATTGGCATTT GAGCAATCAC GCCCAAGCAG TTTTTGGTGT 

2151 CGCACCGCAT CAAAGCCACA CAATCTGTAC ACGTTCGGAC TGGACGGGTC 

45 2201 TGACAAATTG TGTCGAAAAA ACCATTACCG ACGATAAAGT GATTGCTTCA 

2251 TTGACTAAGA CCGACATCAG CGGCAATGTC GATCTTGCCG ATCACGCTCA 

23 01 TTTAAATCTC ACAGGGCTTG CCACACTCAA CGGCAATCTT AGTGCAAATG 

23 51 GCGATACACG TTATACAGTC AGCCACAACG CCACCCAAAA CGGCAACCTT 

24 01 AGCCTCGTGG GCAATGCCCA AGCAACATTT AATCAAGCCA CATTAAACGG 
50 24 51 CAACACATCG GCTTCGGGCA ATGCTTCATT TAATCTAAGC GACCACGCCG 

2501 TACAAAACGG CAGTCTGACG CTTTCCGGCA ACGCTAAGGC AAACGTAAGC 

2551 CATTCCGCAC TCAACGGTAA TGTCTCCCTA GCCGATAAGG CAGTATTCCA 

2601 TTTTGAAAGC AGCCGCTTTA CCGGACAAAT CAGCGGCGGC AAGGATACGG 

2651 CATTACACTT AAAAGACAGC GAATGGACGC TGCCGTCAGG CACGGAATTA 

55 2701 GGCAATTTAA ACCTTGACAA CGCCACCATT ACACTCAATT CCGCCTATCG 

2751 CCACGATGCG GCAGGGGCGC AAACCGGCAG TGCGACAGAT GCGCCGCGCC 

2801 GCCGTTCGCG CCGTTCGCGC CGTTCCCTAT TATCCGTTAC ACCGCCAACT 

2851 TCGGTAGAAT CCCGTTTCAA CACGCTGACG GTAAACGGCA AATTGAACGG 

2901 TCAGGGAACA TTCCGCTTTA TGTCGGAACT CTTCGGCTAC CGCAGCGACA 

60 2951 AATTGAAGCT GGCGGAAAGT TCCGAAGGCA CTTACACCTT GGCGGTCAAC 

3001 AATACCGGCA ACGAACCTGC AAGCCTCGAA CAATTGACGG TAGTGGAAGG

3051 AAAAGACAAC AAACCGCTGT CCGAAAACCT TAATTTCACC CTGCAAAACG

3101 AACACGTCGA TGCCGGCGCG TGGCGTTACC AACTCATCCG CAAAGACGGC 

3151 GAGTTCCGCC TGCATAATCC GGTCAAAGAA CAAGAGCTTT CCGACAAACT 

3201 CGGCAAGGCA GAAGCCAAAA AACAGGCGGA AAAAGACAAC GCGCAAAGCC 

5 3251 TTGACGCGCT GATTGCGGCC GGGCGCGATG CCGTCGAAAA GACAGAAAGC 

3301 GTTGCCGAAC CGGCCCGGCA GGCAGGCGGG GAAAATGTCG GCATTATGCA 

3351 GGCGGAGGAA GAGAAAAAAC GGGTGCAGGC GGATAAAGAC ACCGCCTTGG 

34 01 CGAAACAGCG CGAAGCGGAA ACCCGGCCGG CTACCACCGC CTTCCCCCGC 

3451 GCCCGCCGCG CCCGCCGGGA TTTGCCGCAA CTGCAACCCC AACCGCAGCC 

10 3501 CCAACCGCAG CGCGACCTGA TCAGCCGTTA TGCCAATAGC GGTTTGAGTG 

3551 AATTTTCCGC CACGCTCAAC AGCGTTTTCG CCGTACAGGA CGAATTAGAC 

3601 CGCGTATTTG CCGAAGACCG CCGCAACGCC GTTTGGACAA GCGGCATCCG 

3651 GGACACCAAA CACTACCGTT CGCAAGATTT CCGCGCCTAC CGCCAACAAA 

3701 CCGACCTGCG CCAAATCGGT ATGCAGAAAA ACCTCGGCAG CGGGCGCGTC 

15 3751 GGCATCCTGT TTTCGCACAA CCGGACCGAA AACACCTTCG ACGACGGCAT 

.3801 CGGCAACTCG GCACGGCTTG CCCACGGCGC CGTTTTCGGG CAATACGGCA 

3 851 TCGACAGGTT CTACATCGGC ATCAGCGCGG GCGCGGGTTT TAGCAGCGGC 

3 901 AGCCTTTCAG ACGGCATCGG AGGCAAAATC CGCCGCCGCG TGCTGCATTA 
3951 CGGCATTCAG GCACGATACC GCGCCGGTTT CGGCGGATTC GGCATCGAAC 

20 4001 CGCACATCGG CGCAACGCGC TATTTCGTCC AAAAAGCGGA TTACCGCTAC 

4 051 GAAAACGTCA ATATCGCCAC CCCCGGCCTT GCATTCAACC GCTACCGCGC 
4101 GGGCATTAAG GCAGATTATT CATTCAAACC GGCGCAACAC ATTTCCATCA 
4151 CGCCTTATTT GAGCCTGTCC TATACCGATG CCGCTTCGGG CAAAGTCCGA 
42 01 ACACGCGTCA ATACCGCCGT ATTGGCTCAG GATTTCGGCA AAACCCGCAG 

25 42 51 TGCGGAATGG GGCGTAAACG CCGAAATCAA AGGTTTCACG CTGTCCCTCC 

4 301 ACGCTGCCGC CGCCAAAGGC CCGCAACTGG AAGCGCAACA CAGCGCGGGC 

4 351 ATCAAATTAG GCTACCGCTG GTAA 

This corresponds to the amino acid sequence [<SEQ ID 650; ORF1-1>] (SEQ ID NO: 650; ORF1-1):



1 MKTTDKRTTE THRKAPKTGR IRFSPAYLAI CLSFGILPQA WAGHTYFGIN

51 YQYYRDFAEN KGKFAVGAKD IEVYNKKGEL VGKSMTKAPM IDFSVVSRNG

101 VAALVGDQYI VSVAHNGGYN NVDFGAEGRN PDQHRFTYKI VKRNNYKAGT

151 KGHPYGGDYH MPRLHKFVTD AEPVEMTSYM DGRKYIDQNN YPDRVRIGAG

201 RQYWRSDEDE PNNRESSYHI ASAYSWLVGG NTFAQNGSGG GTVNLGSEKI

251 KHSPYGFLPT GGSFGDSGSP MFIYDAQKQK WLINGVLQTG NPYIGKSNGF

301 QLVRKDWFYD EIFAGDTHSV FYEPRQNGKY SFNDDNNGTG KINAKHEHNS

351 LPNRLKTRTV QLFNVSLSET AREPVYHAAG GVNSYRPRLN NGENISFIDE

401 GKGELILTSN INQGAGGLYF QGDFTVSPEN NETWQGAGVH ISEDSTVTWK

451 VNGVANDRLS KIGKGTLHVQ AKGENQGSIS VGDGTVILDQ QADDKGKKQA

501 FSEIGLVSGR GTVQLNADNQ FNPDKLYFGF RGGRLDLNGH SLSFHRIQNT

551 DEGAMIVNHN QDKESTVTIT GNKDIATTGN NNSLDSKKEI AYNGWFGEKD

601 TTKTNGRLNL VYQPAAEDRT LLLSGGTNLN GNITQTNGKL FFSGRPTPHA

651 YNHLNDHWSQ KEGIPRGEIV WDNDWINRTF KAENFQIKGG QAVVSRNVAK

701 VKGDWHLSNH AQAVFGVAPH QSHTICTRSD WTGLTNCVEK TITDDKVIAS

751 LTKTDISGNV DLADHAHLNL TGLATLNGNL SANGDTRYTV SHNATQNGNL

801 SLVGNAQATF NQATLNGNTS ASGNASFNLS DHAVQNGSLT LSGNAKANVS

851 HSALNGNVSL ADKAVFHFES SRFTGQISGG KDTALHLKDS EWTLPSGTEL

901 GNLNLDNATI TLNSAYRHDA AGAQTGSATD APRRRSRRSR RSLLSVTPPT

951 SVESRFNTLT VNGKLNGQGT FRFMSELFGY RSDKLKLAES SEGTYTLAVN

1001 NTGNEPASLE QLTVVEGKDN KPLSENLNFT LQNEHVDAGA WRYQLIRKDG

1051 EFRLHNPVKE QELSDKLGKA EAKKQAEKDN AQSLDALIAA GRDAVEKTES

1101 VAEPARQAGG ENVGIMQAEE EKKRVQADKD TALAKQREAE TRPATTAFPR

1151 ARRARRDLPQ LQPQPQPQPQ RDLISRYANS GLSEFSATLN SVFAVQDELD

1201 RVFAEDRRNA VWTSGIRDTK HYRSQDFRAY RQQTDLRQIG MQKNLGSGRV

1251 GILFSHNRTE NTFDDGIGNS ARLAHGAVFG QYGIDRFYIG ISAGAGFSSG

1301 SLSDGIGGKI RRRVLHYGIQ ARYRAGFGGF GIEPHIGATR YFVQKADYRY

1351 ENVNIATPGL AFNRYRAGIK ADYSFKPAQH ISITPYLSLS YTDAASGKVR






1401 TRVNTAVLAQ DFGKTRSAEW GVNAEIKGFT LSLHAAAAKG PQLEAQHSAG
1451 IKLGYRW*

Computer analysis of these sequences gave the following results: 

Homology with a predicted ORF from N. meningitidis (strain A) 

ORF1 (SEQ ID NO: 648) shows 57.8% identity over a 1456aa overlap with an ORF (ORF1a)
(SEQ ID NO: 652) from strain A of N. meningitidis:

10 20 30 40 50 60 

or f 1 . pep MKTTDKRTTETHRKAPKTGR IRFXAAYLAICLSFGIL PQAWAGHTYFGINYQYYRDFAEN 

I M 1 1 1 II 1 1 M 1 1 II II M 1 1 . 1 1 1 1 1 1 1 1 II 1 1 1 1 1 1 1 1 1 1 1 1 1 M 1 1 1 M I 

10 or f la MKTTDKRTTETHRKAPKTGR IRFSPAYLAICLSFGIL PQAWAGHTYFGINYQYYRDFAEN 

10 20 30 40 50 60 

70 80 90 100 110 120 

orf 1 . pep KGKFAVGAKD I EVYNKKGELVGKSMTKAPMIDFS WSRNGVAALVGVQYI VSVAHNGGYN 

I I I I I I I I I I I I I I I I I I I 1 I I I I I I I I I I I I I I I I I I I I I i I lllllllllllil 
15 orf la KGKFAVGAKDI EVYNKKGELVGKSMTKAPMIDFS WSRNGVAALVGDQYI VSVAHNGGYN 

70 80 90 100 110 120 

130 140 150 160 170 180 

or f 1 . pep NVDFGAEGXNIXDQXRXTYKI VKRNNYKAGTKGHPYGGDYHMPRLHKXVTDAEPVEMTSY 

MINIM II I MMMM II :: MM MUM IMMIIMM 

20 orf la NVDFGAEGXN- PDQHRFSYQI VKRNNYKPDNS -HPYNGDXHMPRLHKFVTDAEPVEMTSD 

130 140 150 . 160 170 

190 200 210 
orf 1 . pep MDGRKY IDQNNYPDRVRI GAGRQ YWRSDEDEP NN 

I I I M -I : M 1 1 MM -1 1 1 hh II 

25 or f la MRGNTYSDKEKYPERVRIGSGHHYWRYDDDKHGDLS YSGAWLIGGNTHMQGWGNNGVXSL 

180 190 200 210 220 230 

220 230 240 250 260 

orf 1. pep RESSYH IA SGSPMFIYDAQKQKWLINGVLQTGNPYIGKSNGFQLVRK 

|::: : II MMMM -MM MUM || h MUM 

30 or f 1 a SGDVRHAND YGPMP I AGAAGDSGS PMFI YDKTNNKWLLNGVLQTG Y P YSGRENGFQL I RK 

240 250 260 270 280 290 

270 280 290 300 310 320 

orf 1 . pep DWFYDEIFAGDTHSVFYEPRQNGKYSFNDDNNGTGKINAKHEHNSLPNRLKTRTVQLFNV 

I I II hh II I hi :||hlh:|h::|llll := =h I I Ml-lhlh 
35 orf la DWFYDDI YRGDTHTVXFEPRSNGHFSFTSNNNGTGTVTETNEKVSNP - KLKVQTVRLFDE 

300 310 320 330 340 350 

330 340 350 360 370 380 

orf 1 . pep SLSETAREPVYHAAGGVNSYRPRLNNGENISFIDEGKGELILTSNINQGAGGLYFQGDFT 

Ml Mill : 1 1 1 1 M I M M 1 1 1 1 Ml 1 1 1 MM 1 1 MM 1 1 1 Ml 1 1 1 M I II 

40 orf la SLNETDKEPVY-AAGGVNQYRPRLNNGENLSFIDYGNGKLILSNNINQGAGGLYFEGDFT 

360 370 380 390 400 410 
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390 400 410 420 430 
orf 1 . pep VS PENNETWQGAGVH I S EDSTVTWKVNGVANDRLS KI GKGTL 

MIIIIIMilMIMIIIIIIIIIII IIIIIMI Mill 

or f 1 a VS P ENNETWQGAGVH I S EDSTVTWKVNGVANDRLS K I GKGTLHVQAKGENQGS I SVGDGT 

420 430 . 440 450 460 470 



orf 1 .pep 



orf la VILDQQADDKGKKQAFSEIGLXSGRGTVQLNADNQFNPDKLYFGFRGGRLDLNGHSLSFH 
10 480 490 500 510 520 530 



orf 1 .pep 



orf la RIQNTDEGAMIXXHNATTTSTVTITGNESITQPSGKNINRLNYSKEIAYNGWFGEKDTTK 
15 540 550 560 570 580 590 



orf 1 .pep 



orf la TNGRLNLVYQPAAEDRTXLLSGGTNLNGNITQTNGKLFFSGRPTPHAYNHLGSGWSKMEG 
20 600 610 620 630 640 650 






orf 1 .pep 
orf la 



IPQGEIVWDNDWIXRTFKAENFHIQGGQAVISRNVAKVEGDXHLSNHAQAVFGVAPHQSH 
660 670 680 690 700 710 






440 450 460 470 480 

orf 1 . pep XXXXXDKVTASLTKTD I SGNVDLADHAHLNLTGLATLNGNLSAN 

II : III I II MM II I I hi hi MUM 

orf la TICTRSDWTGLTNCVEXXITDDKVIASLTKTDXSGXVXLXXXXXXXLXGXAXLXGNLSAN 

720 730 740 750 760 770 






490 500 510 520 530 540 

orf 1 . pep GDTRYTVSHNATQNGNXSLVXNAQATFNQATLNGNTSASGNASFNLSDHAVQNGSLTLSG 

Illlll IIIMIII III lllllllllllllhl III I I I I I M M M I I I II I 
orf la GDTRYTVSHNATQNGNLSLVGNAQATFNQATLNGNXSXSGNASFNLSNNAAQNGSLTLSD 
780 790 800 810 820 830 

550 560 570 '580 590 600 

orf 1 . pep NAKANVSHSALNGNVSLADKAVFHFESSRFTGQISGGKDTALHLKDSEWTLPSGXELGNL 

II I M I M II M 1 1 1 M I i II 1 1 iM 1 1 1 1 MM M Mll.lllill llhlll 

orf la NAKANVSHSALNGNVSLADKAVFHFENSRFTGQLSGSKXTALHLKDSEWTLPSGTELGNL 
840 850 860 870 880 890 

610 620 630 640 650 660 

orf 1 . pep NLDNATITLNSAYRHDAAGAQTGSATDAPRRRSRRSRRSLLXVTPPTSVESRFNTLTVNG 

1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 -hllllllll II 1 1 M 1 1 1 M 1 1 II 1 1 1 1 

or f la NLDNATITLNSAYRHDAAGAQTGXVSDTPRRRSRRS - - - LLSVTPPTSVESRFNTLTVNG 

900 910 920 930 940 950 






670 680 690 700 710 720 

orf 1 . pep KIjNGQGTFRFMSELFGYRSDKLKLAESSEGTYTLAVNNTGNEPASLEQLTVVEGKDNKPL 

ill I mi 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 II 1 1 1 1 1 1 1 1 1 1 1 1 1 1 h I h 1 1 1 : 1 1 1 ii 1 1 1 1 

orf la KLNXQGTFRFMSELFGYRSDKLKLAESSEGTYTLAVNNTGNEPVSLDQLTWEGKDNKPL 
960 970 980 990 1000 1010

730 740 750

orf 1 . pep SENLNFTLQNEHVDAGAW 

I I I I I I M M I I I I I I I I 

orf la SENLNFTLQNEHVDAGAWRYQLIRKDGEFRLHNPVKEQELSDKLGKAEAKKQAEKDNAQS 
1020 1030 1040 1050 1060 1070 



orf 1 .pep 



orf la LDALIAAGRDAAEKTESVAEPARXAGGENVGIMQAEEEKKRVQADKDSALAKQREAETRP 
10 1080 1090 1100 1110 1120 1130 

760 

orfl.pep LDR 

III 

orf la XTTAFPRARXARRDLPQPQPQPQPQPQPQRDLXSRYANSGLSEFSATLNSVFAVQDELDR 
15 1140 1150 1160 1170 1180 1190 

770 780 790 800 810 820 

orf 1 . pep VFAEDRRNAVWTSGIRDTKHYRSQDFRAYRQQTDLRQIGMQKNLGSGRVGILFSHNRTEN 

MM Ml II MM II 1 1 1 1 1 i 1 1 1 1 1 1 1 1 1 II I ;l M II 1 1 1 1 M 1 1 1 1 1 1 1 1 1 1 

orf la VFAEDRRNAVWTSXIRXTKHYRSQDFRAYRQQTDLRQIGMQKNLGSGRVGILFSHNRTEN 
20 1200 1210 1220 1230 1240 1250 

830 840 850 860 870 880 

orf 1 . pep TFDDGIGNSARLAHGAVFGQYGIDRFYIGISAGAGFSSGSLSDGIGXKXRRRVLHYGIQA 

M 1 1 1 1 1 1 1 1 II 1 1 1 II 1 1 II 1 1 II MMIIIII Mill I MIIMIMM 

orf la XFDDGIGNSARLAHGAVFGQYGIGRFDIGISTGAGFSSGXLSDGIGGKIRRRVLHYGIQA 
25 1260 1270 1280 1290 1300 1310 

890 900 910 920 930 940 

orf 1 .pep RYRAGFGGFGIEPHIGATRYFVQKADYRYENVNIATPGLAFNRYRAGIKADYSFKPAQHI 

IIIMIIIIIIIMIIIIIIII lllllll IIIIIIMIIIMIIIIIIIIII III 

orf la RYRAGFGGFGIEPYIGATRYFVQKADYRYENWIATPGLAFNRYRAGIKADYSFKPAQHX 
30 1320 1330 1340 1350 1360 1370 

950 960 970 980 990 ' 1000 

orf 1 . pep S ITPYLSLS YTDAASGKVRTRVNTAVLAQDFGKTRS AEWGVNAE I KGFTLSLHAAAAKGP 

Mill IIIIIIIIMIIIIII IIIMII illlllllllMIIUII MM MM 

or f 1 a S I TPYXSLS YTDAASGKVRTRVNTAVLAQDFGKTRS AEWGVNAE I KGFTLSXHAAAAKGP 

35 1380 1390 1400 1410 1420 1430 

1010 1020 
orf 1 . pep QLEAQHS AG I KLGYRWX 

MMIMIMMIMM 
orfla QLEAQHSAG I KLGYRWX 

40 1440 1450 

The complete length ORF1a nucleotide sequence [<SEQ ID 651>] (SEQ ID NO: 651) is:

1 ATGAAAACAA CCGACAAACG GACAACCGAA ACACACCGCA AAGCCCCGAA 

51 AACCGGCCGC ATCCGCTTCT CGCCTGCTTA CTTAGCCATA TGCCTGTCGT 

45 101 TCGGCATTCT TCCCCAAGCT TGGGCGGGAC ACACTTATTT CGGCATCAAC 

151 TACCAATACT ATCGCGACTT TGCCGAAAAT AAAGGCAAGT TTGCAGTCGG 

201 GGCGAAAGAT ATTGAGGTNT ACAACAAAAA AGGGGAGTTG GTCGGCAAAT 

251 CAATGACAAA AGCCCCGATG ATTGATTTTT CTGTGGTGTC GCGTAACGGC 

3 01 GTGGCGGCAT TGGTGGGCGA TCAATATATT GTGAGCGTGG CACATAACGG 
50 '3 51 CGGCTATAAC AACGTTGATT TTGGTGCGGA AGGAAGNAAT CCCGATCAGC 

401 ACCGTTTTTC TTACCAAATT GTGAAAAGAA ATAATTATAA GCCTGACAAT

451 TCACACCCTT ACAACGGCGA TTANCATATG CCGCGTTTGC ATAAATTTGT

501 CACAGATGCA GAACCTGTCG AAATGACGAG TGACATGAGG GGGAATACCT 

551 ATTCCGATAA AGAAAAATAT CCCGAGCGTG TCCGCATCGG CTCAGGACAC 

601 CACTATTGGC GTTATGATGA TGACAAACAC GGCGATTTAT CCTACTCCGG 

5 651 CGCATGGTTA ATTGGCGGCA ATACACATAT GCAGGGTTGG GGAAATAATG 

701 GCGTANTTAG TTTGAGCGGC GATGTGCGCC ATGCCAACGA CTATGGCCCT 

751 ATGCCGATTG CAGGTGCGGC AGGCGACAGC GGTTCGCCAA TGTTTATTTA 

801 TGACAAAACA AACAATAAAT GGCTGCTCAA CGGAGTTTTA CAAACCGGCT 

851 ACCCTTATTC CGGCAGGGAA AACGGTTTCC AGCTGATACG CAAAGATTGG 

10 901 TTCTACGATG ACATTTACAG AGGCGATACA CATACCGTCT NTTTTGAACC 

951 GCGCAGTAAC GGACATTTTT CCTTTACATC CAACAACAAC GGTACGGGTA 

1001 CGGTAACAGA AACCAACGAA AAGGTNTCCA ATCCAAAGCT TAAAGTACAG 

1051 ACAGTCCGAC TGTTTGACGA ATCTTTGAAT GAAACTGATA AAGAACCAGT 

1101 TTACGCGGCA GGGGGTGTTA ATCAGTACCG TCCAAGGTTA AACAACGGTG 

15 1151 AAAACCTTTC TTTTATCGAT TACGGCAACG GCAAACTCAT CTTATCAAAC 

1201 AACATCAACC AAGGCGCGGG CGGTTTGTAT TTTGAAGGTG ATTTTACGGT 

1251 CTCGCCTGAA AACAACGAAA CGTGGCAAGG CGCGGGCGTT CATATCAGTG 

1301 AAGACAGTAC CGTTACTTGG AAAGTAAACG GCGTGGCAAA CGACCGCCTG 

1351 TCCAAAATCG GCAAAGGCAC GCTGCACGTT CAAGCCAAAG GGGAAAACCA 

20 1401 AGGCTCGATC AGCGTGGGCG ACGGTACAGT CATTTTGGAT CAGCAGGCAG 

1451 ACGATAAAGG CAAAAAACAA GCCTTTAGTG AAATCGGCTT GNTCAGCGGC 

1501 AGGGGTACGG TGCAACTGAA TGCCGATAAT CAGTTCAACC CCGACAAACT 

1551 CTATTTCGGC TTTCGCGGCG GACGTTTGGA TTTAAACGGG CATTCGCTTT 

1601 CGTTCCACCG TATTCAAAAT ACCGATGAAG GGGCGATGAT TGNCNATCAT 

25 1651 AATGCCACAA CAACATCCAC CGTTACCATT ACAGGGAATG AAAGTATTAC 

1701 ACAACCGAGT GGTAAGAATA TCAATAGACT TAATTACAGC AAAGAAATTG 

1751 CCTACAACGG TTGGTTTGGC GAGAAAGATA CGACCAAAAC GAACGGGCGG 

1801 CTCAACCTTG TTTACCAGCC CGCCGCAGAA GACCGCACCC NGCTGCTTTC 

1851 CGGCGGAACA AATTTAAACG GCAACATCAC GCAAACAAAC GGCAAACTGT 

30 1901 TTTTCAGCGG CAGACCGACA CCGCACGCCT ACAATCATTT AGGAAGCGGG 

1951 TGGTCAAAAA TGGAAGGTAT CCCACAAGGA GAAATCGTGT GGGACAACGA 

2001 CTGGATCNAC CGCACGTTTA AAGCGGAAAA TTTCCATATT CAGGGCGGGC 

2051 AGGCGGTGAT TTCCCGCAAT GTTGCCAAAG TGGAAGGCGA TTGNCATTTG 

2101 AGCAATCACG CCCAAGCAGT TTTTGGTGTC GCACCGCATC AAAGCCATAC 

35 2151 AATCTGTACA CGTTCGGACT GGACNGGTCT GACAAATTGT GTCGAANAAA 

2201 NCATTACCGA CGATAAAGTG ATTGCTTCAT TGACTAAGAC NGACNTNAGC 

2251 GGCANTGTNA GNCTNNCCNA TNACGNTNNT TNAAANCTCN CNGGGCNTGC 

2301 NNCACTNAAN GGCAATCTTA GTGCAAATGG CGATACACGT TATACAGTCA 

2351 GCCACAACGC CACCCAAAAC GGCAACCTTA GCCTCGTGGG CAATGCCCAA 

40 2401 GCAACATTTA ATCAAGCCAC ATTAAACGGC AACNCATCGG NTTCGGGCAA 

2451 TGCTTCATTT AATCTAAGCA ACAACGCCGC ACAAAACGGC AGTCTGACGC 

2 501 TTTCCGACAA CGCTAAGGCA AACGTAAGCC ATTCCGCACT CAACGGCAAT 

2 551 GTCTCCCTAG CCGATAAGGC AGTATTCCAT TTTGAAAACA GCCGCTTTAC 

2601 CGGACAACTC AGCGGCAGCA AGGANACAGC ATTACACTTA AAAGACAGCG 

45 2651 AATGGACGCT GCCGTCAGGC ACGGAATTAG GCAATTTAAA CCTTGACAAC 

2701 GCCACCATTA CACTCAATTC CGCCTATCGC CACGATGCTG CAGGCGCGCA 

2 751 AACCGGCAGN GTGTCAGACA CGCCGCGCCG CCGTTCGCGC CGTTCCCTAT 

2 801 TATCCGTTAC ACCGCCAACT TCGGTAGAAT CCCGTTTCAA CACGCTGACG 

2851 GTAAACGGCA AATTGAACNG TCAAGGAACA TTCCGCTTTA TGTCGGAACT 

50 2 901 CTTCGGCTAC CGAAGCGACA AATTGAAGCT GGCGGAAAGT TCCGAAGGNA 

2 951 CTTACACCTT GGCGGTCAAC AATACCGGCA ACGAACCCGT AAGCCTCGAT 

- 3001 CAATTGACGG TAGTGGAAGG GAAAGACAAC AAACCGCTGT CCGAAAACCT 

3051 TAATTTCACC CTGCAAAACG AACACGTCGA TGCCGGCGCG TGGCGTTACC 

3101 AACTCATCCG CAAAGACGGC GAGTTCCGCC TGCATAATCC GGTCAAAGAA 

55 3151 CAAGAGCTTT CCGACAAACT CGGCAAGGCA GAAGCCAAAA AACAGGCGGA 

32 01 AAAAGACAAC GCGCAAAGCC TTGACGCGCT GATTGCGGCC GGGCGCGATG 
3251 CCGCCGAAAA GACAGAAAGC GTTGCCGAAC CGGCCCGGCN GGCAGGCGGG 
3301 GAAAATGTCG GCATTATGCA GGCGGAGGAA GAGAAAAAAC GGGTGCAGGC 

33 51 GGATAAAGAC AGCGCNTTGG CGAAACAGCG CGAAGCGGAA ACCCGGCCGG 
60 3401 NTACCACCGC CTTCCCCCGC GCCCGCNGCG CCCGCCGGGA TTTGCCGCAA 

3451 CCGCAGCCCC AACCGCAACC TCAACCCCAA CCGCAGCGCG ACCTGATNAG

3501 CCGTTATGCC AATAGCGGTT TGAGTGAATT TTCCGCCACG CTCAACAGCG

3551 TTTTCGCCGT ACAGGACGAA TTGGACCGCG TGTTTGCCGA AGACCGCCGC 

3601 AACGCNGTTT GGACAAGCNG CATCCGGNAC ACCAAACACT ACCGTTCGCA 

3651 AGATTTCCGC GCCTACCGCC AACAAACCGA CCTGCGCCAA ATCGGTATGC 

5 3 701 AGAAAAACCT CGGCAGCGGG CGCGTCGGCA TCCTGTTTTC GCACAACCGG 

3751 ACCGAAAACA NCTTCGACGA CGGCATCGGC AACTCGGCAC GGCTTGCCCA 

3 801 CGGCGCCGTT TTCGGGCAAT ACGGCATCGG CAGGTTCGAC ATCGGCATCA 

3 851 GCACGGGCGC GGGTTTTAGC AGCGGCANTC TNTCAGACGG CATCGGAGGC 

3 901 AAAATCCGCC GCCGCGTGCT GCATTACGGC ATTCAGGCAC GATACCGCGC 
10 3 951 CGGTTTCGGC GGATTCGGCA TCGAACCGTA CATCGGCGCA ACGCGCTATT 

4 001 TCGTCCAAAA AGCGGATTAC CGCTACGAAA ACGTCAATAT CGCCACCCCC 
4 051 GGTCTTGCGT TCAACCGNTA CCGNGCGGGC ATTAAGGCAG ATTATTCATT 
4101 CAAACCGGCG CAACACATNT CCATCACNCC TTATTTNAGC CTGTCCTATA 
4151 CCGATGCCGC TTCGGGCAAA GTCCGAACAC GCGTCAATAC CGCNGTATTG 

15 4201 GCTCAGGATT TCGGCAAAAC CCGCAGTGCG GAATGGGGCG TAAACGCCGA 

4251 AATCAAAGGT TTCACGCTGT CCNTCCACGC TGCCGCCGCC AAAGGNCCGC

43 01 AACTGGAAGC GCAACACAGC GCGGGCATCA AATTAGGCTA CCGCTGGTAA 

This encodes a protein having amino acid sequence [<SEQ ID 652>] (SEQ ID NO: 652):



1 MKTTDKRTTE THRKAPKTGR IRFSPAYLAI CLSFGILPQA WAGHTYFGIN

51 YQYYRDFAEN KGKFAVGAKD IEVYNKKGEL VGKSMTKAPM IDFSVVSRNG

101 VAALVGDQYI VSVAHNGGYN NVDFGAEGXN PDQHRFSYQI VKRNNYKPDN

151 SHPYNGDXHM PRLHKFVTDA EPVEMTSDMR GNTYSDKEKY PERVRIGSGH

201 HYWRYDDDKH GDLSYSGAWL IGGNTHMQGW GNNGVXSLSG DVRHANDYGP

251 MPIAGAAGDS GSPMFIYDKT NNKWLLNGVL QTGYPYSGRE NGFQLIRKDW

301 FYDDIYRGDT HTVXFEPRSN GHFSFTSNNN GTGTVTETNE KVSNPKLKVQ

351 TVRLFDESLN ETDKEPVYAA GGVNQYRPRL NNGENLSFID YGNGKLILSN

401 NINQGAGGLY FEGDFTVSPE NNETWQGAGV HISEDSTVTW KVNGVANDRL

451 SKIGKGTLHV QAKGENQGSI SVGDGTVILD QQADDKGKKQ AFSEIGLXSG

501 RGTVQLNADN QFNPDKLYFG FRGGRLDLNG HSLSFHRIQN TDEGAMIXXH

551 NATTTSTVTI TGNESITQPS GKNINRLNYS KEIAYNGWFG EKDTTKTNGR

601 LNLVYQPAAE DRTXLLSGGT NLNGNITQTN GKLFFSGRPT PHAYNHLGSG

651 WSKMEGIPQG EIVWDNDWIX RTFKAENFHI QGGQAVISRN VAKVEGDXHL

701 SNHAQAVFGV APHQSHTICT RSDWTGLTNC VEXXITDDKV IASLTKTDXS

751 GXVXLXXXXX XXLXGXAXLX GNLSANGDTR YTVSHNATQN GNLSLVGNAQ

801 ATFNQATLNG NXSXSGNASF NLSNNAAQNG SLTLSDNAKA NVSHSALNGN

851 VSLADKAVFH FENSRFTGQL SGSKXTALHL KDSEWTLPSG TELGNLNLDN

901 ATITLNSAYR HDAAGAQTGX VSDTPRRRSR RSLLSVTPPT SVESRFNTLT

951 VNGKLNXQGT FRFMSELFGY RSDKLKLAES SEGTYTLAVN NTGNEPVSLD

1001 QLTVVEGKDN KPLSENLNFT LQNEHVDAGA WRYQLIRKDG EFRLHNPVKE

1051 QELSDKLGKA EAKKQAEKDN AQSLDALIAA GRDAAEKTES VAEPARXAGG

1101 ENVGIMQAEE EKKRVQADKD SALAKQREAE TRPXTTAFPR ARXARRDLPQ

1151 PQPQPQPQPQ PQRDLXSRYA NSGLSEFSAT LNSVFAVQDE LDRVFAEDRR

1201 NAVWTSXIRX TKHYRSQDFR AYRQQTDLRQ IGMQKNLGSG RVGILFSHNR

1251 TENXFDDGIG NSARLAHGAV FGQYGIGRFD IGISTGAGFS SGXLSDGIGG

1301 KIRRRVLHYG IQARYRAGFG GFGIEPYIGA TRYFVQKADY RYENVNIATP

1351 GLAFNRYRAG IKADYSFKPA QHXSITPYXS LSYTDAASGK VRTRVNTAVL

1401 AQDFGKTRSA EWGVNAEIKG FTLSXHAAAA KGPQLEAQHS AGIKLGYRW*

A transmembrane region is underlined.
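
The relationship between a nucleotide listing and the protein it is said to encode can be checked with a short translation sketch. It assumes the standard genetic code (the document does not state which table or tool was used) and reads the sequence in frame from its first base. A codon containing N is assigned a residue only when every possible base gives the same amino acid; otherwise it is reported as X. The names `translate_codon` and `translate` are illustrative only.

```python
from itertools import product

# Standard genetic code, codons ordered TTT, TTC, TTA, TTG, TCT, ... (assumed table).
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def translate_codon(codon):
    """Translate one codon; N is expanded to all four bases."""
    codon = codon.upper()
    options = [BASES if base == "N" else base for base in codon]
    # Any symbol other than A, C, G, T or N cannot be resolved.
    if any(ch not in BASES for opt in options for ch in opt):
        return "X"
    residues = {CODON_TABLE["".join(c)] for c in product(*options)}
    # A single possible residue is unambiguous; several possibilities give X.
    return residues.pop() if len(residues) == 1 else "X"


def translate(dna):
    """Translate in frame from the first base, stopping after the first stop codon."""
    dna = "".join(dna.split()).upper()  # drop the spaces used in the listing
    protein = []
    for i in range(0, len(dna) - 2, 3):
        residue = translate_codon(dna[i:i + 3])
        protein.append(residue)
        if residue == "*":
            break
    return "".join(protein)
```

Applied to the last 15 bases of the listing above, `translate("GGCTA CCGCTGGTAA")` gives `GYRW*`, which agrees with the end of the protein sequence. The in-frame stretch `AAAGGNCCG` gives `KGP`, because GGN encodes glycine whatever the fourth base is, while a codon such as `GNT` gives `X`.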

ORF1-1 (SEQ ID NO: 650) shows 86.3% identity over a 1462 aa overlap with ORF1a (SEQ ID NO: 652):
10 20 30 40 50 60 

orf la . pep MKTTDKRTTETHRKAPKTGRIRFSPAYLAICLSFGILPQAWAGHTYFGINYQYYRDFAEN 

M 1 1 1 1 II 1 1 1 1 1 1 1 1 1 1 1 M I III 1 1 1 1 1 1 M 1 1 1 II 1 1 1 1 1 M 1 1 1 1 1 1 1 1 1 1 II 1 1 

orf 1-1 MKTTDKRTTETHRKAPKTGR I RFS PAYLA I CLS FGI LPQAWAGHT YFG INYQYYRDFAEN 

5 10 20 30 40 50 60 

70 80 90 100 110 120 

orf la . pep KGKFAVGAKDIEVYNKKGELVGKSMTKAPMIDFSWSRNGVAALVGDQYIVSVAHNGGYN 

1 1 1 1 1 1 1 1 M 1 1 1 1 1 M 1 1 > 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 II 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 

orf 1-1 KGKFAVGAKDI EVYNKKGELVGKSMTKAPM I DFS WSRNGVAALVGDQY I VS VAHNGGYN 

10 70 80 90 100 110 120 

130 140 150 160 170 179 

orf la . pep NVDFGAEGXNPDQHRFSYQIVKRNNYKPDNS-HPYNGDXHMPRLHKFVTDAEPVEMTSDM 

MINIM I I I I I I I = I : I I M I I I I :: 111 = 11 II I I I I I II I I I I I I I I I I I 

or f 1 - 1 NVDFGAEGRNPDQHRFTYKI VKRNNYKAGTKGHP YGGDYHMPRLHKFVTDAEPVEMTS YM 

15 130 140 150 160 170 180 

180 190 200 210 220 230 
orf la . pep RGNTYSDKEKYPERVRIGSGHHYWRYDDDKHGDL- - SYSGA WLIGGNTHMQGWGNN 

I I |:::|hllMhh:|M hh :: II I I I : I I I I h =:: 

orf 1-1 DGRKYIDQNNYPDRVRIGAGRQYWRSDEDEPNNRESS YHI ASAYSWLVGGNTFAQNGSGG 

20 190 200 210 220 230 240 

240 250 260 270 280 290 

orf la . pep GVXSLSGD- VRHANDYGPMPIAGAAGDSGSPMFI YDKTNNKWLLNGVLQTGYPYSGRENG 

i :|::: : = | = || I :|: lllllllllll - I I I = I I I I I I I II I I I 
orf 1-1 GTVNLGSEKI KHS - P YGFLPTGGS FGDSGS PMF I YDAQKQKWL INGVLQTGNP Y I GKSNG 

25 250 260 270 280 290 

300 310 320 330 340 350 

orf la . pep FQLIRKDWFYDDIYRGDTHTVXFEPRSNGHFSFTSNNNGTGTVTETNEKVSNP-KLKVQT 
MMIMMMM 'llh :|||:||::||:::||||| :: I I I :||::| 
orf 1-1 FQLVRKDWFYDEIFAGDTHSVFYEPRQNGKYSFNDDNNGTGKINAKHEHNSLPNRLKTRT 
30 300 310 320 330 340 350 

360 ' 370 380 390 400 410 

orf la . pep VRLFDESLNETDKEP VY - AAGGVNQYRPRLNNGENLS F I DYGNGKL I LSNN INQGAGGLY 

hlh 11 = 11 Hill I I I I II = I I I I I I I I I h I I I I h h II h = I I I I I I I I I I 
orf 1-1 VQLFNVSLSETAREPVYHAAGGVNSYRPRLNNGENISFIDEGKGELILTSNINQGAGGLY 

35 360 370 380 390 400 410 

420 430 440 450 460 470 

orf la . pep FEGDFTVSPENNETWQGAGVHISEDSTVTWKVNGVANDRLSKIGKGTLHVQAKGENQGSI 
h I I I I I I I I I I I I I I I I I I I I M I II I I II I I I I I I I M I I I I I I I I I I I I II I I II 
orf 1-1 FQGDFTVSPENNETWQGAGVHISEDSTVTWKVNGVANDRLSKIGKGTLHVQAKGENQGSI 
40 420 430 440 450 460 470 

480 490 500 510 520 530 

orf la . pep SVGDGTVILDQQADDKGKKQAFSEIGLXSGRGTVQLNADNQFNPDKLYFGFRGGRLDLNG 

1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 II 1 1 1 1 1 1 1 1 1 1 II 1 1 1 1 1 1 1 1 1 M 1 1 1 

orf 1 - 1 SVGDGTVILDQQADDKGKKQAFSEIGLVSGRGTVQLNADNQFNPDKLYFGFRGGRLDLNG 
45 480 490 500 510 520 530 

540 550 560 570 580 590 

orf la. pep HSLSFHRIQNTDEGAMIXXHNATTTSTVTITGNESITQPSGKNINRLNYSKEIAYNGWFG 

llllllllllll Ml II I I I 1 I I I I : : I ^ :hl I I MMIIIMM 

orf 1-1 HS LS FHR I QNTDEGAM I VNHNQDKESTVT I TGNKD I AT - TGNN - NS LDS KKE I AYNGWFG 

50 540 550 560 570 580 590 
600 610 620 630 640 650 

orf la . pep EKDTTKTNGRLNLVYQPAAEDRTXLLSGGTNLNGNITQTNGKLFFSGRPTPHAYNHLGSG 

III MM III I INI INI MM II I III I III MM I II I Ml 1 1 MINI MM:: 

orf 1-1 EKDTTKTNGRLNLVYQPAAEDRTLLLSGGTNLNGNITQTNGKLFFSGRPTPHAYNHLNDH 
5 600 610 620 630 640 650 

660 670 680 690 700 710 

orf la . pep WSKMEGIPQGEIVWDNDWIXRTFKAENFHIQGGQAVISRNVAKVEGDXHLSNHAQAVFGV 

I I II hi I II II II II M M M I M I M M I M 1 1 II M M 1 1 MINI II MM 

orf 1-1 WSQKEGI PRGEIVWDNDWINRTFKAENFQIKGGQAVVSRNVAKVKGDWHLSNHAQAVFGV 

10 660 670 680 690 700 710 

720 730 740 750 760 770 

orf la . pep APHQSHTICTRSDWTGLTNCVEXXITDDKVIASLTKTDXSGXVXLXXXXXXXLXGXAXLX 

I I I I I M I I I I II I I I I I I I : I I I I I I I I I I I I I I || I I hi hi 

orf 1-1 APHQSHTICTRSDWTGLTNCVEKTITDDKVIASLTKTDISGNVDLADHAHLNLTGLATLN 
15 720 730 740 750 760 770 

780 790 800 810 820 830 

orf la . pep GNLSANGDTRYTVSHNATQNGNLSLVGNAQATFNQATLNGNXSXSGNASFNLSNNAAQNG 

II I II I II I II 1 1 1 1 1 II 1 1 1 1 II 1 1 1 1 1 1 1 1 1 II II 1 1 1 h I MIMMIhMMM 

orf 1-1 GNLSANGDTRYTVSHNATQNGNLSLVGNAQATFNQATLNGNTSASGNASFNLSDHAVQNG 
20 780 790 800 810 820 830 

840 850 860 870 880 890 

orf la . pep SLTLSDNAKANVSHS ALNGNVS LADKAVFHFENSRFTGQLSGSKXTALHLKDSEWTLPSG 

Mill 1 1 1 1 1 i 1 1 1 1 1 1 1 U 1 1 1 1 1 1 1 1 M 1 1 1 1 M I : i 1 1 1 1 1 1 1 1 1 1 1 M I 

orf 1-1 SLTLSGNAKANVSHS ALNGNVS LAD KAVFHFESSRFTGQ I SGGKDTALHLKDSEWTLPSG 

25 840 850 860 870 880 890 

900 910 920 930 940 
orf la .pep TELGNLNLDNATITLNSAYRHDAAGAQTGXVSDTPRRRSRRS LLSVTPPTSVESRFN 

Ml I I I I I I I I I I I I I I I I I I I I I I I :M: III I III I I II I! I I I I I I M II 
orf 1-1 TELGNLNLDNATITLNSAYRHDAAGAQTGSATDAPRRRSRRSRRSLLSVTPPTSVESRFN 
30 900 910 920 930 940 950 

950 960 970 980 990 1000 

or f la . pep TLTVNGKLNXQGTFRFMSELFGYRSDKLKLAESSEGTYTLAVNNTGNEPVSLDQLTWEG 

1 1 1 1 1 1 1 1 1 I I 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 II 1 1 1 I II 1 1 II I 1 1 h I h 1 1 1 1 1 II 

or f 1 - 1 TLTVNGKLNGQGTFRFMSELFGYRSDKLKLAESSEGTYTLAVNNTGNEPASLEQLTWEG 
35 960 970 980 990 1000 1010 

1010 1020 1030 1040 1050 1060 

orf la . pep KDNKPLSENLNFTLQNEHVDAGAWRYQLIRKDGEFRLHNPVKEQELSDKLGKAEAKKQAE 

I I I I I I I I I I I I I I I I I I I II I I I I II II I I I I I I I I II I II II II M I I I I I I I I II I I 
orf 1-1 KDNKPLSENLNFTLQNEHVDAGAWRYQLIRKDGEFRLHNPVKEQELSDKLGKAEAKKQAE 

40 1020 1030 1040 1050 1060 1070 

1070 1080 1090 1100 1110 1120 

or f la . pep KDNAQSLDALI AAGRDAAEKTESVAEPARXAGGENVGIMQAEEEKKRVQADKDSALAKQR 

I I I I I I I I I I I I I I I I I : I I I I I I I I I I I 1 I I I I I J I I I I I I I I I I J I 1 I t I = I I 1 I I I 
or f 1 - 1 KDNAQ S LDAL I AAGRD AVEKTES VAE PARQAGGENVG I MQAE EE KfCRVQAD KDTALAKQR 

45 1080 1090 1100 1110 1120 1130 

. 1130 1140 1150 1160 1170 1180 

orf la . pep EAETRPXTTAFPRARXARRDLPQPQPQPQPQPQPQRDLXSRYANSGLSEFSATLNSVFAV 

Mill 1 1 1 1 1 1 1 1 Mill 1 1 1 1 1 1 1 1 MM I IIIIIIIIIIMIIIIII 

orf 1-1 EAETRPATTAFPRARRARRDLPQLQPQPQPQP- -QRDLISRYANSGLSEFSATLNSVFAV 

50 1140 1150 1160 1170 1180 1190 
1190 1200 1210 1220 1230 1240 

orf la . pep QDELDRVFAEDRRNAVWTSX I RXTKHYRSQDFRAYRQQTDLRQ I GMQKNLGSGRVG I LFS 

I I I I I I I I I I I I I I I II I I II ] I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 
orf 1-1 QDELDRVFAEDRRNAVWTSG I RDTKHYRSQDFRAYRQQTDLRQ I GMQKNLGSGRVG I LFS 

5 1200 1210 1220 1230 1240 1250 

1250 1260 1270 1280 1290 1300 

orf la . pep HNRTENXFDDGIGNSARLAHGAVFGQYGIGRFDIGISTGAGFSSGXLSDGIGGKIRRRVL 

Illllhllllllllllllllllllllll II lllhlllllll llllllllllllll 

orf 1- 1 HNRTENTFDDGIGNSARLAHGAVFGQYGIDRFYIGISAGAGFSSGSLSDGIGGKIRRRVL 
10 1260 1270 1280 1290 1300 1310 

1310 1320 1330 1340 1350 1360 

orf la . pep HYGIQARYRAGFGGFGIEPYIGATRYFVQKADYRYENVNIATPGLAFNRYRAGIKADYSF 

I M 1 1 1 1 1 II III! M hi I IM I M II III 1 1 i II 1 1 1 1 II 1 1 1 1 1 1 Mil 1 1 M 

orf 1-1 HYG I Q ARYRAGFGG FG I E PH I G ATRY FVQ KAD YR YENVN I AT PGLAFNR YRAG I KADYS F 

15 1320 1330 1340 1350 1360 1370 

1370 1380 1390 1400 1410 1420 

orf la . pep KPAQHXSITPYXSLSYTDAASGKVRTRVNTAVLAQDFGKTRSAEWGVNAEIKGFTLSXHA 

lllll lllll Ml IIIMIIMIHIIIMIIMIII I MIIIIIMIMI II 

orf 1-1 KPAQHISITPYLSLSYTDAASGKVRTRVNTAVLAQDFGKTRSAEWGVNAEIKGFTLSLHA 
20 1380 1390 1400 1410 1420 1430 

1430 1440 1450 

orf la . pep AAAKGPQLEAQHSAGI KLGYRWX 

Illllllllllllllllllllll 
orfl-1 AAAKGPQLEAQHSAGI KLGYRWX 

25 1440 1450 
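
The percent identity figures quoted for these alignments can be reproduced from the aligned rows themselves. The sketch below is one common convention, assumed here rather than taken from the document: identical columns divided by the total number of aligned columns, with gap columns counted in the denominator. The function name `percent_identity` is hypothetical, and other tools may use a different denominator.

```python
def percent_identity(row_a, row_b, gap="-"):
    """Percent of aligned columns in which both rows carry the same residue.

    Both rows must already be aligned (same length, gaps written as '-').
    Whitespace is ignored so rows can be pasted from a listing.
    """
    a = "".join(row_a.split()).upper()
    b = "".join(row_b.split()).upper()
    if len(a) != len(b):
        raise ValueError("aligned rows must have the same length")
    if not a:
        raise ValueError("aligned rows must not be empty")
    # A column counts as identical only if the symbols match and are not gaps.
    # Note: matching X symbols are counted as identical in this simple sketch.
    identical = sum(1 for x, y in zip(a, b) if x == y and x != gap)
    return 100.0 * identical / len(a)
```

For the final block of the alignment above, where both rows read `AAAKGPQLEAQHSAGIKLGYRWX`, the function returns 100.0. Under this convention, 86.3% identity over a 1462 aa overlap corresponds to roughly 1262 identical columns.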

Homology with adhesion and penetration protein hap precursor of H. influenzae (accession number P45387) (SEQ ID NO: 1153)

Amino acids 23-423 of ORF1 (SEQ ID NO: 648) show 59% aa identity with hap protein (SEQ ID NO: 1153) in 450 aa overlap:



orf1 23
hap 6
orf1 83
hap 66
orf1 143
hap 125
orf1 203
hap 185
orf1 223
hap 245



F +L C+S GI QAWAGHTYFGI + YQYYRDFAENKGKF VGAK+IEVYNK+G+LVG 
FRLNFLTACVSLGIASQAWAGHTYFGIDYQYYRDFAENKGKFTVGAKNIEVYNKEGQLVG 6 5 



SMTKAPMIDFSWSRNGVAALVG QYIVSVAHNGGYN+VDFGAEG N DQ R TY+IV 



KRNNY+A + HPY GDYHMPRLHK VT+AEPV MT+ MDG+ Y D+ NYP+RVRIG+GR 



222 



40 QYWR+D+DE N SSY+++ 



SGSPMFIYDA+K++WLIN VLQTG+P+ G+ NGFQL+R++WFY+E+ A DT SVF 
orfl 278 --YEPRQNGKYSFNDDNNGTGKIN-AKHEHNSLPNRLKTRTVQLFNVSLSETAREPVYHA 334 

Y P NG YSF +N+GTGK+ + + + + TV+LFN SL++TA+E V A 
hap 305 QRYIPPINGHYSFVSNNDGTGKLTLTRPSKDGSKAKSEVGTVKLFNPSLNQTAKEHV-KA 363 

orfl 335 AGGVNS YRPRLNNGEN I S F I DEGKGEL I LTSN INQGAGGLYFQGDFTV - S PENNETWQGA 393 
5 A G N Y+PR+ G+NI D+GKG L + +NINQGAGGLYF+G+F V +NN TWQGA 

hap 364 AAGYNIYQPRMEYGKNIYLGDQGKGTLTIENNINQGAGGLYFEGNFWKGKQNNITWQGA 423 

orfl 3 94 GVH I S EDSTVTWKVNGVANDRLS KI GKGTL 423 

GV I +D+TV WKV+ NDRLSKIG GTL 
hap 424 GVS I GQD AT VE WKVHN P ENDRLS K I G I GTL 453 

10 

Amino acids 715-1011 of ORF1 (SEQ ID NO: 648) show 50% aa identity with hap protein (SEQ ID NO: 1153) in 258 aa overlap:

Orfl 41 DTRYTVSHNATQ-NGNXSLVXNAQATFNQ-ATLNGNTSASGNASFNLSDHAVQNGSLTLS 98 
15 DT+ S TQ NG+ +L NA + A LNGN + ++ F LS++A Q G+ + LS 

hap 733 DTKVINSIPITQINGSINLTNNATVNIHGLiAKLNGNVTLIDHSQFTLSNNATQTGNIKLS 792 

orfl 99 GNAKANVSHSALNGNVSLADKAVFHFESSRFTGQISGGKDTALHLKDSEWTLPSGXELGN 158 

+A A V+++ LNGNV L D A F ++S F QI G KDT + L+++ WT+PS L N 
hap 793 NHANATVNNATLNGNVHLTDSAQFSLKNSHFWHQIQGDKDTTVTLENATWTMPSDTTLQN 852 

20 orfl 159 LNLDNATITLNSAYRHDAAGAQTGSATDAPXXXXXXXXXXLLXVTPPTSVESRFNTLTVN 218 

L L+N+T+TLNSAY + S+ +AP L T PTS E RFNTLTVN 

hap 853 LTLNNSTVTLNSAY SASSNNAPRHRRS LETETTPTSAEHRFNTLTVN 899 

orfl 219 GKLNGQGTFRFMSELFGYRSDKLKLAESSEGTYTLAVNNTGNEPASLEQLTWEGKDNKP 278 
GKL+GQGTF+F S LFGY+SDKLKL+ +EG YTL+V NTG EP +LEQLT++E DNKP 
25 hap 900 GKLSGQGTFQFTSSLFGYKSDKLKLSNDAEGDYTLSVRNTGKEPVTLEQLTLIESLDNKP 959 

orfl 279 LSENLNFTLQNEHVDAGA 296 

LS+ L FTL+N+HVDAGA 
hap 960 LSDKLKFTLENDHVDAGA 977 

Amino acids 1192-1450 of ORF1 (SEQ ID NO: 648) show 41% aa identity with hap protein (SEQ ID NO: 1153) in 259 aa overlap:

Orfl 1 LDRVFAEDRRNAVWTSGIRDTKHYRSQDFRAYRQQTDLRQIGMQKNLGSGRVGILFSHNR 60 

LDR+F + ++AVWT+ +D + Y S FRAY+Q+T+LRQIG+QK L +GR+G +FSH+R 
hap 1135 LDRLFVDQAQS AVWTN I AQDKRRYDSDAFRAYQQKTNLRQ I GVQKALANGR I GAVFSHSR 1194 

35 orfl 61 TENT FDDG I GNS ARLAHG AVFGQ YG I DR F YXXXXXXXXXXXXXXXXX I GX KXRRR VLH YG 120 

++NTFD+ +NAL+FQY KR+ ++YG 

hap 1195 SDNTFDEQVKNHATLTMMSGFAQYQWGDLQFGVNVGTGISASKMAEEQSRKIHRKAINYG 1254 

orfl 121 IQARYRAGFGGFG I E PH IGATRYFVQKADYRYENVN I ATPGLAFNRYRAG I KADYS FKP A 180 
+ A Y+ G GI+P+ G RYF+++ +Y+ E V + TP LAFNRY AGI+ DY+F P 
40 hap 1255 VNASYQFRLGQLGIQPYFGVNRYFIERENYQSEEVRVKTPSLAFNRYNAGIRVDYTFTPT 1314 

orfl 181 QHI S I TPYLSLSYTDAASGKVRTRVNTAVLAQDFGKTRS AEWGVNAE I KGFTLSLHAAAA 240 

+IS+ PY ++Y D ++ V+T VN VL Q FG+ E G+ AEI F +S + + 
hap 1315 DN I S VKPYFFVNYVDVSNANVQTTVNLTVLQQPFGRYWQKEVGLKAE I LHFQ I S AF I S KS 1374 
orfl 241 KGPQLEAQHSAGI KLGYRW 259 

+G QL Q + G+ KLGYRW 
hap 1375 QGSQLGKQQNVGV KLGYRW 1393 

Homology with a predicted ORF from N. gonorrhoeae 

The blocks of ORF1 (SEQ ID NO: 648) show 83.5%, 88.3%, and 97.7% identities in 467, 298, and 259 aa overlap, respectively, with a predicted ORF (ORF1ng) (SEQ ID NO: 654) from N. gonorrhoeae:






orf 1 .pep 
orf lng 
orf 1 .pep 
orf lng 
orf 1 .pep 
orf lng 
orf 1 .pep 
orf lng 
orf 1 . pep 
orf lng 
orf 1 -pep 
orf lng 
orf 1 .pep 
orf lng 
orf 1 .pep 
orf lng 
orf 1 . pep 
orf lng 
orf 1 .pep 
orf lng 
orf 1 .pep 
orf lng 



MKTTDKRTTETHRKAPKTGRIRFXAAYLAICLSFGILPQAWAGHTYFGINYQYYRDFAEN 

I Ml 1 1 1 M 1 1 1 1 1 1 1 1 1 1 1 ! I 1 1 1 1 1 II 1 1 1 1 1 1 1 1 I! Illlll Illlllll 

MKTTDKRTTETHRKAPKTGRIRFSPAYLAICLSFGILPQARAGHTYFGINYQYYRDFAEN 
KGKFAVGAKDIEVYNKKGELVGKSMTKAPMIDFSWSRNGVAALVGVQYIVSVAHNGGYN 

MINI MM 1 1 III II MM Mill II MM I MM 1 1 Mill: I III Illlll 

KGKFAVGAKD I E VYNKKGELVGKSMTKAPM I DFSWS RNGVAALAGDQY I VS VAHNGGYN 



MDGRKY I DQNNYPDRVR I GAGRQYWRSDEDEPNNRESS YH I AS 

Ml II I MINI Illlll IMIIMIIIIIIIIIII 

MDGWKYADLNKYPDRVRIGAGRQYWRSDEDEPNNRESSYHIASAYSWLVGGNTFAQNGSG 



VQLFNVSLSETAREPVYHAAGGVNSYRPRLNNGENISFIDEGKGELILTSNINQGAGGLY 

IIIIIIMI I MM Ml II IIIIIIIIIIM Mill MM 1 1 1 1 1 II II I II 1 1 1 1 II 

VQLFNVSLSETAREPVYHAAGGVNSYRPRLNNGENISFIDKGKGELILTSNINQGAGGLY 

FQGDFTVS PENNETWQGAGVH I SEDS TVTWKVNGVANDRLS KI GKGT 

I : I : I I II h I M I II I I I I I I I : I I I I II I I I I I I I I I I I I I I II 

FEGNFTVS PKNNETWQGAGVH I SDGS TVTWKVNGVANDRLS K I GKGTLLVQAKGENQGSV 

// 

DKVTASLTKTDISGNVDLADHAHLNLTGLA 

III llhllh .llhlllllllllllll 
FGVAPHQSHTICTRSDWTGLTSCTEKTITDDKVIASLSKTDVRGNVSLADHAHLNLTGLA 



60 



60 



120 



120 



180 



NVDFGAEGXNIXDQXRXTYKIVKRNNYKAGTKGHPYGGDYHMPRLHKXVTDAEPVEMTSY 

IMIIIII I II I MM II ' 1 1 1 1 II h 1 1 1 1 1 1 1 1 1 1 1 1 1 IIIIIIIIIIM " 

NVDFGAEGSN- PDQHRFS YQI VKRNNYKAGTNGHPYGGDYHMPRLHKFVTDAEPVEMTS Y 179 



223 



239 



GS PM F I YDA QKQ KWL I N GVLOTGN P Y I GKSNG 255 

I I I II II II I I I I II I I II I I II M I II II I I 

GGTVNLGSEKI KHS P Y GFLPTGGS FGDSGS PMF I YDA QKQKWL IN GVLOTGNPY I GKSNG 2 89 

FQLVRKDWFYDEIFAGDTHSVFYEPRQNGKYSFNDDNNGTGKINAKHEHNSLPNRLKTRT 315 

MTll III I III I II I Ml Mil I h Mill llhllh I Ihllhl III HUM 

FOL VRKDWFYDE I FAGDTHSVFYEPHQNGKYFFNDNNNGAGKIDAKHKHYSLPYRLKTRT 35 9 



375 



422 



479 



744 



774 



803 



TLNGNLSANGDTR- YTVSHNATQNGNXSLVXNAQATFNQATLNGNTS ASGNAS FNLSDHA 

Mill :::MI = " I M I I I I Ml MMMIMMMIMM I i I I I I I = = I 
TFNGNL-VQAETRTIRLRANATQNGNLSLVGNAQATFNQATLNGNTSASDNASFNLSNNA 833 

VQNGSLTLSGNAKANVSHSALNGNVSLADKAVFHFESSRFTGQISGGKDTALHLKDSEWT 863 

1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 h 1 1 1 1 h 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 

VQNGSLTLSDNAKANVSHSALNGNVSLADKAVFHFENSRFTGKISGGKDTALHLKDSEWT 893 
orf 1 .pep LPSGXELGNLNLDNATITLNSAYRHDAAGAQTGSATDAPRRRSRRSRRSLLXVTPPTSVE 923 

MIIMMII! 1 1 Illllll MMMMIMM MMIMMMM II lllllhl 

orflng LPSGTELGNLNLDNATITLNSAYRHDAAGAQTGSAADAPRRRSRRS LLSVTPPTSAE 950 

orf 1 .pep SRFNTLTVNGKLNGQGTFRFMSELFGYRSDKLKLAESSEGTYTLAVNNTGNEPASLEQLT 983 

Illllll MM Mlllllll lllllllll Illlllll Mill MM MM IMIIMI 

orflng SRFNTLTVNGKLNGQGTFRFMSELFGYRSGKLKLAESSEGTYTLAVNNTGNEPVSLEQLT 1010 

orfl.pep WEGKDNKPLSENLNFTLQNEHVDAGAW 1011 

Illllll l l l h l l i l l l l l l l l l M l 

orflng WEGKDNTPLS ENLNFTLQNEHVDAGAWRYQL I RKDGEFRLHNP VKEQELSDKLGKAGET 1070 

// 

orfl pep LDRVFAEDRRNAVWTSGIRDTKHYRSQDFR 1211 

I I I I I I I I I II I I I I I I I I I I I ' I I I I 
orflng PQRDLISRYANSGLSEFSATLNSVFAVQDELDRVFAEDRRNAVWTSGIRDTKHYRSQDFR 1239 

orf 1 .pep AYRQQTDLRQIGMQKNLGSGRVGILFSHNRTENTFDDGIGNSARLAHGAVFGQYGIDRFY 1271 

I IMIMIIII IIIIMIMI Illllll IIIIIIIIIIIIIIIMIIIIIII II 

orflng AYRQQTDLRQIGMQKNLGSGRVGILFSHNRTGNTFDDGIGNSARLAHGAVFGQYGIGRFD 12 99 

orf 1 .pep I G I S AGAGFS SGS LSDGI GXKXRRRVLHYG I QARYRAGFGGFG I EPH I GATRYFVQKAD Y 1331 

'IMIMIIII MM I 1 1 II 1 1 1 1 1 1 1 1 1 1 1 M 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 

orflng IGISAGAGFS SGS LSDG I RGKIRRRVLHYG I QARYRAGFGGFG I EPH I GATRYFVQKADY 1359 

orf 1 .pep RYENVNI ATPGLAFNRYRAGI KADYS FKPAQHI S I TPYLSLS YTDAASGKVRTRVNTAVL 1391 

I I I I ! I I I I I ; I I I I I I I I I I I I I I I I I I I I I I I I I I I I I : I ! I I I I I [ I I 

orflng RYENVNI ATPGLAFNRYRAGI KADYS FKPAQHISITPYLSLS YTDAASGKVRTRVNTAVL 1419 

orf 1 .pep AQDFGKTRS AEWGVNAE I KGFTLS LHAAAAKGPQLEAQHS AG I KLGYRW 1440 

M Ml M I Ml 1 1 II III M I IMMIMI M MM M III IMM 1 1! 

orflng AQDFGKTRSAEWGVNAE I KGFTLS LHAAAAKGPQLEAQHS AG I KLGYRW 1468 

The complete length ORF1ng nucleotide sequence was identified [<SEQ ID 653>] (SEQ ID NO: 653):

1 ATGAAAACAA CCGACAAACG GACAACCGAA ACACACCGCA AAGCCCCTAA 

51 AACCGGCCGC ATCCGCTTCT CGCCCGCTTA CTTAGCCATA TGCCTGTCGT 

101 TCGGCATTCT GCCCCAAGCC CGGGCGGGAC ACACTTATTT CGGCATCAAC 

151 TACCAATACT ATCGCGACTT TGCCGAAAAT AAAGGCAAGT TTGCAGTCGG

201 GGCGAAAGAT ATTGAGGTTT ACAACAAAAA AGGGGAGTTG GTCGGCAAAT 

2 51 CGATGACGAA AGCCCCGATG ATTGATTTTT CTGTGGTATC GCGTAACGGC 

301 GTGGCGGCAT TGGCGGGCGA TCAATATATT GTGAGCGTGG CACATAACGG 

351 CGGCTATAAC AATGTTGATT TTGGTGCGGA GGGAAGCAAT CCCGATCAGC 

4 01 ACCGCTTTTC TTACCAAATT GTGAAAAGAA ATAATTATAA AGCAGGGACT 

451 AACGGCCATC CTTATGGCGG CGATTATCAT ATGCCGCGTT TGCACAAATT 

501 TGTCACAGAT GCAGAACCTG TTGAGATGAC CAGTTATATG GATGGGTGGA 

551 AATACGCTGA TTTAAATAAA TACCCTGATC GTGTTCGAAT CGGAGCAGGC 

601 AGACAATATT GGCGGTCTGA TGAAGACGAA CCCAATAACC GCGAAAGTTC 

651 ATATCATATT GCAAGCGCAT ATTCTTGGCT CGTCGGTGGC AATACCTTTG 

701 CACAAAATGG ATCAGGTGGT GGCACAGTCA ACTTAGGTAG CGAAAAAATT 

751 AAACATAGCC CATATGGTTT TTTACCAACA GGAGGCTCAT TTGGCGACAG 

801 TGGCTCACCA ATGTTTATCT ATGATGCCCA AAAGCAAAAG TGGTTAATTA 

851 ATGGGGTATT GCAAACAGGC AACCCCTATA TAGGAAAAAG CAATGGCTTC 

901 CAGCTAGTTC GTAAAGATTG GTTCTATGAT GAAATCTTTG CTGGAGATAC 

951 CCATTCAGTA TTCTACGAAC CACATCAAAA TGGGAAATAC TTTTTTAACG 

1001 ACAATAATAA TGGCGCAGGA AAAATCGATG CCAAACATAA ACACTATTCT 

1051 CTACCTTATA GATTAAAAAC ACGAACCGTT CAATTGTTTA ATGTTTCTTT 

1101 ATCCGAGACA GCAAGAGAAC CTGTTTATCA TGCTGCAGGT GGGGTCAACA 
1151 GTTATCGACC CAGACTGAAT AATGGAGAAA ATATTTCCTT TATTGACAAA 

1201 GGAAAAGGTG AATTGATACT TACCAGCAAC ATCAACCAAG GCGCGGGCGG 

1251 TTTGTATTTT GAGGGTAATT TTACGGTCTC GCCTAAAAAC AACGAAACGT 

1301 GGCAAGGCGC GGGCGTTCAT ATCAGTGATG GCAGTACCGT TACTTGGAAA 

5 1351 GTAAACGGCG TGGCAAACGA CCGCCTGTCC AAAATCGGCA AAGGCACGCT 

1401 GCTGGTTCAA GCCAAAGGGG AAAACCAAGG CTCGGTCAGC GTGGGCGACG 

1451 GTAAAGTCAT CTTAGATCAG CAGGCGGACG ATCAAGGCAA AAAACAAGCC 

1501 TTTAGTGAAA TCGGCTTGGT CAGCGGCAGG GGGACGGTGC AACTGAATGC 

1551 CGATAATCAG TTCAACCCCG ACAAACTCTA TTTCGGCTTT CGCGGCGGAC 

10 1601 GTTTGGATTT GAACGGGCAT TCGCTTTCGT TCCACCGCAT TCAAAATACC 

1651 GATGAAGGGG CGATGATTGT CAACCACAAT CAAGACAAAG AATCCACCGT 

1701 TACCATTACA GGCAATAAAG ATATTACTAC AACCGGCAAT AACAACAACT 

1751 TGGATAGCAA AAAAGAAATT GCCTACAACG GTTGGTTTGG CGAGAAAGAT 

1801 GCAACCAAAA CGAACGGGCG GCTCAATCTG AATTACCAAC CGGAAGAAGC 

15 1851 GGATCGCACT TTACTGCTTT CCGGCGGAAC AAATTTAAAC GGCAATATCA 

1901 CGCAAACAAA CGGCAAACTG TTTTTCAGCG GCAGACCGAC ACCGCACGCC 

1951 TACAATCATT TAGGAAGCGG GTGGTCAAAA ATGGAAGGTA TCCCACAAGG 

2001 AGAAATCGTG TGGGACAACG ATTGGATCGA CCGCACATTT AAAGCGGAAA 

2051 ACTTCCATAT TCAGGGCGGA CAAGCGGTGG TTTCCCGCAA TGTTGCCAAA 

20 2101 GTGGAAGGCG ATTGGCATTT AAGCAATCAC GCCCAAGCAG TTTTCGGTGT 

2151 CGCACCGCAT CAAAGCCACA CAATCTGTAC ACGTTCGGAC TGGACGGGTC 

2201 TGACAAGTTG TACCGAAAAA ACCATTACCG ACGATAAAGT GATTGCTTCA 

2251 TTGAGCAAGA CCGACATCAG AGGCAATGTC AGCCTTGCCG ATCACGCTCA 

2301 TTTAAATCTC ACAGGACTTG CCACACTCAA CGGCAATCTT AGTGCAGGCG 

25 2351 GAGACACGCA CTATACGGTT ACGCGCAACG CCACCCAAAA CGGCAACCTC 

24 01 AGCCTCGTGG GCAATGCCCA AGCAACATTT AATCAAGCCA CATTAAACGG 

2451 CAACACATCG GCTTCGGACA ATGCTTCATT TAATCTAAGC AACAACGCCG 

2501 TACAAAACGG CAGTCTGACG CTTTCCGACA ACGCTAAGGC AAACGTAAGC 

2551 CATTCCGCAC TCAACGGCAA TGTCTCCCTA GCCGATAAGG CAGTATTCCA 

30 2601 TTTTGAAAAC AGCCGCTTTA CCGGAAAAAT CAGCGGCGGC AAGGATACGG 

2651 CATTACACTT AAAAGACAGC GAATGGACGC TGCCGTCGGG CACGGAATTA 

2 701 GGCAATTTAA ACCTTGACAA CGCCACCATT ACACTCAATT CCGCCTATCG 

2751 ACACGATGCG GCAGGCGCGC AAACCGGCAG TGCGGCAGAT GCGCCGCGCC 

2 801 GCCGTTCGCG CCGTTCCCTA TTATCCGTTA CGCCGCCAAC TTCGGCAGAA 

35 2 851 TCCCGTTTCA ACACGCTGAC GGTAAACGGC AAATTGAACG GTCAGGGAAC 

2 901 ATTCCGCTTT ATGTCGGAAC TCTTCGGCTA CCGCAGCGGC AAATTGAAGC 

2 951 TGGCGGAAAG TTCCGAAGGC ACTTACACCT TGGCTGTCAA CAATACCGGC 
3001 AACGAACCCG TAAGTCTCGA GCAATTGACG GTAGTGGAAG GAAAAGACAA 
3051 CACACCGCTG TCCGAAAATC TTAATTTCAC CCTGCaaaAc gaacacgtcg 

40 3101 atgccggcgc atggCGTTAT CAGCTTATCC gcaaagacgG CGAGTTCCgc 

3151 CTGCATAATC CGGTCAAAGA ACAAGAGCTT TCCGACAAAC TCGGCAAGgc 

3201 gggagaaACA GAggccgccT TGACGGCAAA ACAGGCacaA CTTGCCGCCA 

3251 AAcaacaggc ggaaaAAGAC AACgcgcaaa gccttgAcgc gctgattgcg 

3301 gCcgggcgca atgccaccga AAAGGCAgaa agtgttgccg aaccgGCCCG 

45 3 351 GCAGGCAGGC GGGGAAAAtg ccgGCATTAT GCAGGCGGAG GAAGAGAAAA 

34 01 AACGGGTGCA GGCGGATAAA GACACCGCCT TGGCGAAACA GCGCGAAGCG 

34 51 GAAACCCGGC CGGCTACCAC CGCCTTCCCC CGCGCCCGCC GCGCCCGCCG 

3501 GGATTTGCCG CAACCGCAGC CCCAACCGCA ACCCCAACCG CAGCGCGACC 

3551 TGATCAGCCG TTATGCCAAT AGCGGTTTGA GTGAATTTTC CGCCACGCTC 

50 3601 AACAGCGTTT TCGCCGTACA GGACGAATTG GACCGCGTGT TTGCCGAAGA 

3651 CCGCCGCAAC GCCGTTTGGA CAAGCGGCAT CCGGGACACC AAACACTACC 

3701 GTTCGCAAGA TTTCCGCGCC TACCGCCAAC AAACCGACCT GCGCCAAATC 

3751 GGTATGCAGA AAAACCTCGG CAGCGGGCGC GTCGGCATCC TGTTTTCGCA 

3801 CAACCGGACC GGAAACACCT TCGACGACGG CATCGGCAAC TCGGCACGGC 

55 3851 TTGCCCACGG TGCCGTTTTC GGGCAATACG GCATCGGCAG GTTCGACATC 

3 901 GGCATCAGCG CGGGCGCGGG TTTTAGTAGC GGCAGCCTTT CAGACGGCAT 

3 951 CAGAGGCAAA ATCCGCCGCC GCGTGCTGCA TTACGGCATT CAGGCAAGAT 

4 001 ACCGCGCAGG TTTCGGCGGA TTCGGCATCG AACCGCACAT CGGCGCAACG 
4051 CGCTATTTCG TCCAAAAAGC GGATTACCGA TACGAAAACG TCAATATCGC 

60 4101 CACCCCGGGC CTTGCATTCA ACCGCTACCG CGCGGGCATT AAGGCAGATT 

4151 ATTCATTCAA ACCGGCGCAA CACATTTCCA TCACGCCTTA TTTGAGCCTG 
4201 TCCTATACCG ATGCCGCTTC CGGCAAAGTC CGAACGCGCG TCAATACCGC 

4251 CGTATTGGCG CAGGATTTCG GCAAAACCCG CAGTGCGGAA TGGGGCGTAA 

4 301 ACGCCGAAAT CAAAGGTTTC ACGCTGTCCC TCCACGCTGC CGCCGCCAAG 

4 351 GGGCCGCAAT TGGAAGCGCA GCACAGCGCG GGCATCAAAT TAGGCTACCG 

4401 CTGGTAA 



This is predicted to encode a protein having amino acid sequence [<SEQ ID 654>] (SEQ ID NO: 654):



1 MKTTDKRTTE THRKAPKTGR IRFSPAYLAI CLSFGILPQA RAGHTYFGIN 

51 YQYYRDFAEN KGKFAVGAKD IEVYNKKGEL VGKSMTKAPM IDFSVVSRNG

101 VAALAGDQYI VSVAHNGGYN NVDFGAEGSN PDQHRFSYQI VKRNNYKAGT 

151 NGHPYGGDYH MPRLHKFVTD AEPVEMTSYM DGWKYADLNK YPDRVRIGAG 

201 RQYWRSDEDE PNNRESSYHI ASAYSWLVGG NTFAQNGSGG GTVNLGSEKI 

251 KHSPYGFLPT GGSFGDSGSP MFIYDAQKQK WLINGVLQTG NPYIGKSNGF

301 QLVRKDWFYD EIFAGDTHSV FYEPHQNGKY FFNDNNNGAG KIDAKHKHYS 

351 LPYRLKTRTV QLFNVSLSET AREPVYHAAG GVNSYRPRLN NGENISFIDK 

401 GKGELILTSN INQGAGGLYF EGNFTVSPKN NETWQGAGVH ISDGSTVTWK 

451 VNGVANDRLS KIGKGTLLVQ AKGENQGSVS VGDGKVILDQ QADDQGKKQA 

501 FSEIGLVSGR GTVQLNADNQ FNPDKLYFGF RGGRLDLNGH SLSFHRIQNT 

551 DEGAMIVNHN QDKESTVTIT GNKDITTTGN NNNLDSKKEI AYNGWFGEKD 

601 ATKTNGGLNL NYPPEEADRT LLLSGGTNLN GNITQTNGKL FFSGRPTPHA 

651 YNHLGSGWSK MEGIPQGEIV WDNDWIDRTF KAENFHIQGG QAVVSRNVAK

701 VEGDWHLSNH AQAVFGVAPH QSHTICTRSD WTGLTSCTEK TITDDKVIAS 

751 LSKTDVRGNV SLADHAHLNL TGLATFNGNL VQAETRTIRL RANATQNGNL 

801 SLVGNAQATF NQATLNGNTS ASDNASFNLS NNAVQNGSLT LSDNAKANVS 

851 HSALNGNVSL ADKAVFHFEN SRFTGKISGG KDTALHLKDS EWTLPSGTEL 

901 GNLNLDNATI TLNSAYRHDA AGAQTGSAAD APRRRSRRSL LSVTPPTSAE 

951 SRFNTLTVNG KLNGQGTFRF MSELFGYRSG KLKLAESSEG TYTLAVNNTG 

1001 NEPVSLEQLT VVEGKDNTPL SENLNFTLQN EHVDAGAWRY QLIRKDGEFR

1051 LHNPVKEQEL SDKLGKAGET EAALTAKQAQ LAAKQQAEKD NAQSLDALIA 

1101 AGRNATEKAE SVAEPARQAG GENAGIMQAE EEKKRVQADK DTALAKQREA 

1151 ETRPATTAFP RARRARRDLP QPQPQPQPQP QRDLISRYAN SGLSEFSATL 

1201 NSVFAVQDEL DRVFAEDRRN AVWTSGIRDT KHYRSQDFRA YRQQTDLRQI 

1251 GMQKNLGSGR VGILFSHNRT GNTFDDGIGN SARLAHGAVF GQYGIGRFDI 

1301 GISAGAGFSS GSLSDGIRGK IRRRVLHYGI QARYRAGFGG FGIEPHIGAT 

1351 RYFVQKADYR YENVNIATPG LAFNRYRAGI KADYSFKPAQ HISITPYLSL 

1401 SYTDAASGKV RTRVNTAVLA QDFGKTRSAE WGVNAEIKGF TLSLHAAAAK 

1451 GPQLEAQHSA GIKLGYRW* 



Underlined and double-underlined sequences represent the active site of a serine protease (trypsin 
family) and an ATP/GTP-binding site motif A (P-loop). 
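
A motif such as the ATP/GTP-binding site motif A (P-loop) can be located with a simple pattern search. The sketch below assumes the widely used consensus [AG]-x(4)-G-K-[ST] for this motif; the document does not say which pattern or tool produced the annotation, and the name `find_p_loops` is hypothetical. Positions are reported 1-based, with an `offset` so that a fragment can be numbered as in the full sequence.

```python
import re

# Assumed consensus for the P-loop: [AG]-x(4)-G-K-[ST].
# The lookahead lets overlapping candidates be reported as well.
P_LOOP = re.compile(r"(?=([AG].{4}GK[ST]))")


def find_p_loops(protein, offset=1):
    """Return (position, matched residues) for each P-loop candidate.

    `offset` is the sequence position of the first residue supplied.
    """
    seq = "".join(protein.split()).upper()  # ignore spaces from the listing
    return [(m.start() + offset, m.group(1)) for m in P_LOOP.finditer(seq)]
```

For the fragment `WLINGVLQTG NPYIGKSNGF`, taken as residues 281-300 of the sequence above, `find_p_loops(fragment, offset=281)` returns `[(290, "GNPYIGKS")]`. A pattern hit of this kind is only a candidate site and says nothing by itself about function.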

ORF1-1 (SEQ ID NO: 650) and ORF1ng (SEQ ID NO: 654) show 93.7% identity in 1471 aa overlap:



orf 1-1 .pep 



10 20 30 40 50 60 

MKTTDKRTTETHRKAPKTGRIRFSPAYLAICLSFGILPQAWAGHTYFGINYQYYRDFAEN 



orf lng-1 




20 30 40 50 60 



orf 1-1 .pep 



70 80 90 100 110 120 

KGKFAVGAKD I E VYNKKGELVGKSMTKAPM I D FSWS RNGVAALVGDQ Y I VSVAHNGGYN 
I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I II I I : I I I I I I I I I I I I I I I 
orf lng-1 KGKFAVGAKDIEVYNKKGELVGKSMTKAPMIDFSVVSRNGVAALAGDQYIVSVAHNGGYN 

70 80 90 100 110 120 

130 140 150 160 170 180 

5 or f 1 - 1 . pep NVDFGAEGRNPDQHRFTYKI VKRNNYKAGTKGHPYGGDYHMPRLHKFVTDAEPVEMTS YM 

Mill I I I II I : I M I i I I I I' I I Ml I I I I II I I I I I I I II I I ' I I II I I I I M 
orf lng-1 NVDFGAEGSNPDQHRFSYQI VKRNNYKAGTNGHPYGGDYHMPRLHKFVTDAEPVEMTSYM 

130 140 150 160 170 180 

190 200 210 220 230 240 

] 0 orf 1 - 1 . pep DGRKYIDQNNYPDRVRIGAGRQYWRSDEDEPNNRESS YHIASAYSWLVGGNTFAQNGSGG 

II || I I M I I I I I I I I I I I I I I ' I I I I I I I I I I I II I I I I! I I I I . I I I I I I I I I 
orf lng-1 DGWKY AD LNKYPDRVR I GAGRQYWRSDEDEPNNRESS YHIASAYSWLVGGNTFAQNGSGG 

190 200 210 220 230 240 

250 260 270 280 290 300 

1 5 orf 1-1 .pep GTVNLGS EKI KHS P YGFLPTGGS FGDSGS PMF I YDAQKQKWL INGVLQTGNPY IGKSNGF 

1 1 1 1 1 1 1 1 1 1 1 1 1 1 ■ I I I I I I I I I I I I I I M I I I I I I M I I I I I I I I I I I I I I M I I 
orf lng-1 GTVNLGSEKI KHS PYGFLPTGGS FGDSGS PMF I YDAQKQKWL INGVLQTGNPY IGKSNGF 

250 260 270 280 290 300 

310 320 330 340 350 360 

20 or f 1 - 1 . pep QLVRKDWFYDEI FAGDTHSVFYEPRQNGKYSFNDDNNGTGKINAKHEHNSLPNRLKTRTV 

1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 ! 1 1 1 1 1 1 Mi 1 1 IIMIMIM M III IMIIII 

orf lng-1 QLVRKDWFYDEIFAGDTHSVFYEPHQNGKYFFNDNNNGAGKIDAKHKHYSLPYRLKTRTV 

310 320 330 340 350 360 

370 380 390 400 410 420 

25 orf 1-1 .pep QLFNVSLSETAREPVYHAAGGVNSYRPRLNNGENISFIDEGKGELILTSNINQGAGGLYF 

I I I I I I I I I I I I I I I U I I I I I I I I I I I I I I I I I I II M I i I I I 1 I I I I I I I I I , I I 
orf lng-1 QLFNVSLSETAREPVYHAAGGVNSYRPRLNNGENISFIDKGKGELILTSNINQGAGGLYF 

370 380 390 400 410 420 

430 440 450 460 470 480 

30 orf 1-1 .pep QGDFTVSPENNETWQGAGVHI SEDSTVTWKVNGVANDRLSKIGKGTLHVQAKGENQGS I S 

: I : I I I I h I I I I I I I I I I I I I • I II III III III MM MM I II III I III I I hi 
O r f 1 ng - 1 EGNFTVS P KNNETWQGAGVH I SDGS T VTWKVNGVANDRLS KI GKGTLLVQAKGENQGS VS 

430 440 450 460 470 480 

490 500 510 520 530 540 

35 orf 1-1 .pep VGDGTVILDQQADDKGKKQAFSEIGLVSGRGTVQLNADNQFNPDKLYFGFRGGRLDLNGH 

MM I I I I I I I I: I I I I I I I I I I I I I M 1 M I M I I I I I I M I I I I I I I I I 1 I h I 
orf lng-1 ■ VGDGKVILDQQADDQGKKQAFSEIGLVSGRGTVQLNADNQFNPDKLYFGFRGGRLDLNGH 

490 500 510 520 530 540 

550 560 570 580 590 600 

40 orf 1 - 1 . pep SLSFHRIQNTDEGAMIVNHNQDKESTVTITGNKDIATTGNNNSLDSKKEIAYNGWFGEKD 

IIIIMIIII IIIIIIIIIIIMMIIIMI MIIMhlllllllll llllll 

' orf lng-1 SLSFHRIQNTDEGAMIVNHNQDKESTVTITGNKDITTTGNNNNLDSKKEIAYNGWFGEKD 

550 560 570 580 590 600 

610 620 630 640 650 660 

45 orf 1 - 1 . pep TTKTNGRLNLVYQPAAEDRTLLLSGGTNLNGNITQTNGKLFFSGRPTPHAYNHLNDHWSQ 

MIIMIMI Ml 1 1 M M 1 1 1 II 1 1 II 1 1 1 M II 1 1! II 1 1 1 II I II 1 1 - I h 

orf lng-1 ATKTNGRLNLNYQPEEADRTLLLSGGTNLNGNITQTNGKLFFSGRPTPHAYNHLGSGWSK 

610 620 630 640 650 660 
670 680 690 700 710 720 

orf 1- 1 . pep KEGIPRGEIVWDNDWINRTFKAENFQIKGGQAWSRNVAKVKGDWHLSNHAQAVFGVAPH 

MMMMMMMMI MM MMMMMMMMMMMMMI MMMM 

- orf lng- 1 MEG I PQGE I VWDNDW I DRTFKAENFH I QGGQA WSRNVAKVEGDWHLSNHAQAVFGVAPH 

5 670 680 690 700 710 720 

730 740 750 760 770 780 

orf 1- 1 . pep QSHTICTRSDWTGLTNCVEKTITDDKVIASLTKTDISGNVDLADHAHLNLTGLATLNGNL 

M M M M M M M M M M M MM M M M M 1 1 1 = 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 

orf lng- 1 QSHTICTRSDWTGLTSCTEKTITDDKVIASLSKTDIRGNVSLADHAHLNLTGLATLNGNL 
10 730 740 750 760 770 780 

790 800 810 820 830 840 

orf 1 - 1 . pep SANGDTRYTVSHNATQNGNLSLVGNAQATFNQATLNGNTSASGNASFNLSDHAVQNGSLT 

MMMIMMIMM MM MMIMIMIMIMI MMM-MMMM 

orf lng- 1 SAGGDTHYTVTRNATQNGNLSLVGNAQATFNQATLNGNTSASDNASFNLSNNAVQNGSLT 
15 790 800 810 820 830 840 

850 860 870 880 890 900 

orf 1 - 1 . pep LSGNAKANVSHSALNGNVSLADKAVFHFESSRFTGQISGGKDTALHLKDSEWTLPSGTEL 

M MIMIII MIIIIIIMIIIMIIMIIIM IMMIMIMIMI Mill II 

orf lng- 1 LSDNAKANVSHSALNGNVSLADKAVFHFENSRFTGKISGGKDTALHLKDSEWTLPSGTEL 
20 850 860 870 880 890 900 

910 920 930 940 950 960 

orf 1 - 1 . pep GNLNLDNATITLNSAYRHDAAGAQTGSATDAPRRRSRRSRRSLLSVTPPTSVESRFNTLT 

M M M I M M 1 1 1 1 M 1 1 M 1 1 1 Ml 1 1 1 1 1 M M 1 1 1 1 1 1 1 1 h I II 1 1 1 II 

orf lng- 1 GNLNLDNAT I TLNSAYRHDAAGAQTGSAADAPRRRSR RSLLSVTPPTSAESRFNTLT 

25 910 920 930 940 950 

970 980 990 1000 1010 1020 

orf 1- 1 . pep VNGKLNGQGTFRFMSELFGYRSDKIiKLAESSEGTYTLAVNNTGNEPASLEQLTVVEGKDN 

1 1 1 1 1 1 1 f 1 1 1 1 1 1 1 1 1 E 1 1 1 1 MMMMMMMMI MMMMMMMMM 

orf lng- 1 VNGKLNGQGTFRFMSELFGYRSGKLKLAESSEGTYTLAVWNTGNEPVSLEQLTWEGKDN 
30 960 970 980 990 1000 1010 

1030 1040 1050 1060 1070 
orf 1- 1 . pep KPLSENLNFTLQNEHVDAGAWRYQLIRKDGEFRLHNPVKEQELSDKLGKA 

IIIIIIIIIIIIIIIIIIIIIIIIIIIIIIMIIIIMIIIIIIIIIII 

orf lng- 1 TPLSENLNFTLQNEHVDAGAWRYQLIRKDGEFRLHNPVKEQELSDKLGKAGETEAALTAK 
35 1020 1030 1040 1050 1060 1070 

1080 1090 1100 1110 1120 

orf 1 - 1 . pep EAKKQAEKDNAQSLDALIAAGRDAVEKTESVAEPARQAGGENVGIMQAEEEKKRVQ 

M M M M M M M M M M MM M M M I M M M M M M M M M M M M 

orf lng- 1 QAQLAAKQQAEKDNAQSLDALIAAGRNATEKAESVAEPARQAGGENAGIMQAEEEKKRVQ 
40 1080 1090 1100 1110 1120 1130 

1130 1140 1150 1160 1170 1180 

orf 1-1 .pep ADKDTALAKQREAETRPATTAFPRARRARRDLPQLQPQPQPQPQRDLISRYANSGLSEFS 

llllllllllllllllllllllllllllllllll I M M M MM I M M I M I M 1 1 

orf lng- 1 ADKDTALAKQREAETRPATTAFPRARRARRDLPQPQPQPQPQPQRDLISRYANSGLSEFS 
45 1140 1150 1160 1170 1180 1190 

1190 1200 1210 1220 1230 1240 

orf 1-1 .pep ATLNS VFAVQDELDRVFAEDRRNAVWTSG I RDTKHYRSQDFRAYRQQTDLRQ I GMQKNLG 

M M M M M I M M M M I 1 1 1 1 M 1 1 1 1 1 1 1 1 1 M 1 1 1 1 1 M M M 1 1 1 1 1 1 1 1 1 

o r f 1 ng - 1 ATLNS VFAVQDELDRVFAEDRRNAVWTSG I RDTKH YRSQD FRAYRQQTDLRQ I GMQKNLG 

50 1200 1210 1220 1230 1240 1250 
1250 1260 1270 1280 1290 1300 

orf 1- 1 . pep SGRVGILFSHNRTENTFDDGIGNSARLAHGAVFGQYGIDRFYIGISAGAGFSSGSLSDGI 

lllllllllllll IIIIIIIIMI IIMIMIIIII II 1 1 1 1 1 1 1 1 1 1 ! I II I M I 

orf lng-1 SGRVGILFSHNRTGNTFDDGIGNSARLAHGAVFGQYGIGRFDIGISAGAGFSSGSLSDGI 
5 1260 1270 1280 1290 1300 1310 

1310 1320 1330 1340 1350 1360 

orf 1-1 .pep GGKI RRRVLHYG I QARYRAGFGGFG I E PH I GATRYFVQKADYRYENVN I ATPGLAFNRYR 

II III II I Mil II IIMIMIIIII lllllllllllll II IN I lllllllllllll I 

orf lng-1 RGK I RRRVLHYG I QARYRAGFGGFG IE PHI GATRYFVQKADYRYENVN I ATPGLAFNRYR 

10 1320 1330 1340 1350 1360 1370 

1370 1380 1390 1400 ' 1410 1420 

orfl-l.pep AG I KAD YS FKPAQH I S I TPYLS LS YTDAASGKVRTRVNTAVLAQD FGKTRS AEWGVNAE I 

I I II I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I M I I I I I I I I I I I I I I I I I II I I I 
orflng-1 AG I KAD YS FKPAQH IS I TPYLS LS YTDAASGKVRTRVNTAVLAQD FGKTRS AEWGVNAE I 

15 1380 1390 1400 1410 1420 1430 

1430 1440 1450 

orf 1 - 1 . pep KGFTLSLHAAAAKGPQLEAQHSAGIKLGYRWX 

1 1 1 1 M 1 1 1 1 1 : 1 1 ! 1 1 1 1 1 1 1 1 1 M 1 1 1 

orf lng-1 KGFTLSLHAAAAKGPQLEAQHSAGIKLGYRWX 
20 1440 1450 1460 

In addition, ORF1ng (SEQ ID NO: 654) shows 55.7% identity with hap protein (P45387) (SEQ ID
NO: 1153) over a 1455 aa overlap:

SCORES Init1: 1104 Initn: 4632 Opt: 2680

Smith-Waterman score: 5165; 55.7% identity in 1455 aa overlap 

25 10 20 30 40 50 60 

orflng-1 . pep MKTTDKRTTETHRKAPKTGRIRFSPAYLAICLSFGILPQARAGHTYFGINYQYYRDFAEN 

I :h hhlh II I I I I I I I M I I I I Ml II 
P 4 5387 MKKTVFRLNFLTACISLGIVSQAWAGHTYFGIDYQYYRDFAEN 

10 20 30 40 

30 70 80 90 100 110 120 

orflng-1 . pep KGKFAVGAKD I E VYNKKGELVGKSMTKAPM IDFS WSRNGVAALAGDQY I VS VAHNGGYN 

I I I I :M h : hi I I I : h I I I I I I I I I I II I I II I I I I I II h MIIIIMM Ih 
p4 5 3 8 7 KGKFTVGAQNI KVYNKQGQLVGTSMTKAPM IDFS WSRNGVAALVENQ Y I VS VAHNVGYT 

50 60 70 80 90 100 

35 130 140 150 160 170 180 

orflng-1 . pep NVDFGAEGSNPDQHRFSYQIVKRNNYKAGTNGHPYGGDYHMPRLHKFVTDAEPVEMTSYM 

: 1 1 1 1 1 1 M 1 1 1 1 1 M : 1 1 1 1 1 1 1 I III III MINIM MMI I 

p4 53 87 DVDFGAEGNNPDQHRFTYKIVKRNNYKKD-NLHPYEDDYHNPRLHKFVTEAAPIDMTSNM 
110 120 130 140 150 160 

40 190 200 210 220 230 240 

orflng-1 . pep DGWKYADLNKYPDRVRIGAGRQYWRSDEDEPNNRESSYHIASAYSWLVGGNTFAQNGSGG 

:| hi Mlhllllhllhlhhh : -MMI =1 — 111 I hh 

p4 5387 NGSTYSDRTKYPERVRIGSGRQFWRNDQDKGD QVAGAYHYLTAGNTHNQRGAGN 

170 180 190 200 210 

45 250 260 270 280 290 300 

orf Ing- 1 . pep GTVNLGSEKIKHSPYGFLPTGGSFGDSGSPMFIYDAQKQKWLINGVLQTGNPYIGKSNGF 

I Ih: I : II II Ml M 1 1 M I M M MM 1 1 1 1 1 M M MM II Ml 






p4 53 87 GYSYLGGDVRKAGEYGPLPIAGSKGDSGSPMFIYDAEKQKWLINGILREGNPFEGKENGF 
220 230 240 250 260 270 

310 320 330 340 350 360 

orf lng- 1 . pep QLVRKDWFYDEIFAGDTHSVFYEPHQNGKYFFNDNNNGAGKIDAKHKHYSLPYRLKTRTV 
llllh: | | | | | | :| II I :: hll |:| | ::| ::| 

p4 53 87 QLVRKS YF - DE I FERDLHTSLYTRAGNGVYTI SGNDNGQGS I TQKS GIPSEIK 1 

280 290 300 310 320 



370 380 390 400 410 419 

orf lng- 1 .pep QLFNVSLSETAREPVYHAA-GGVNSYRPRLNNGENISFIDKGKGELILTSNINQGAGGLY 

| |:|| :: - III 111111- hh :| I I := I = I I I I I I I I I 

p4 5387 TLANMSLPLKEKDKVHNPRYDGPNIYSPRLNNGETLYFMDQKQGSLIFASDINQGAGGLY 

330 340 350 360 370 380 



420 430 440 450 460 470 479 

orf lng- 1 . pep FEGNFTVSPKNNETWQGAGVHISDGSTVTWKVNGVANDRLSKIGKGTLLVQAKGENQGSV 

I I I M M I I -I : M I I I I : I : I- I M I I I I I M HIM II II III II I.I I I hi h 
p4 5387 FEGNFTVS PNSNQTWQGAGIHVSENSTVTWKVNGVEHDRLSKIGKGTLHVQAKGENKGSI 

390 400 410 420 430 440 



480 490 500 510 520 530 539 

orf lng- 1 . pep S VGDGKV I LDQQADDQGKKQAFS E I GLVSGRGTVQLNADNQFNPDKLYFGFRGGRLDLNG 

I I I M I I I : II I I I I M I II I M II I I > I I I I I I I = I I : I h I I I I I I I I I I I I I 
P4 53 87 SVGDGKVILEQQADDQGNKQAFSEIGLVSGRGTVQLNDDKQFDTDKFYFGFRGGRLDLNG 
450 460 470 480 490 500 

540 550 560 570 580 590 

orf lng- 1 .pep HSLSFHRIQNTDEGAMIVNHNQDKESTVTITGNKDITT-TGNN-NNLDSKKEIAYNGWFG 

I M : |: I I I I I 1 I I i I I II I : ::|i||||::h :||| hll MIMIIMM 
p453 87 HSLTFKRIQNTDEGAMIVNHNTTQAANVTITGNESIVLPNGNNINKLDYRKEIAYNGWFG 
510 520 530 540 550 560 



600 610 620 630 640 650 

or f lng- 1 . pep EKDATKTNGRLNLNYQPEEADRTLLLSGGTNLNGNITQTNGKLFFSGRPTPHAYNHLGSG 

I I :| MM hi Ml IMIMMMMM Mllllllll IMIIIM: 

p4 53 87 ETDKNKHNGRLNLIYKPTTEDRTLLLSGGTNLKGDITQTKGKLFFSGRPTPHAYNHLNKR 
570 580 590 600 610 620 

660 670 680 690 700 710 

orf lng- 1 . pep - WSKMEGIPQGEIVWDNDWIDRTFKAENFHIQGGQAVVSRNVAKVEGDWHLSNHAQAVFGV 

IMIIIIII MIMIIIMM IMIMMIMI lll|::MIM MMMMIII 

p45387 WSEMEGIPQGEIVWDHDWINRTFKAENFQIKGGSAWSRNVSSIEGNWTVSNNANATFGV 
630 640 650 660 670 680 



720 730 740 750 760 770 

orf lng- 1 .pep APHQ SHT I CTRS DWTGLTS CTE KT I TDDKV I AS LS KTD I RGNVS LADHAHLNLTGLATLN 

40 :MMMMI MIIIIMI : : I I I I I h I h I h : : h h I h III II 

p4 53 87 VPNQQNTICTRSDWTGLTTCQKVDLTDTKVINSIPKTQINGSINLTDNATANVKGLAKLN 

690 700 710 720 730 740 

780 790 800 810 820 830 

or f lng- 1 . pep GNLS AGGDTH YT VTRNATQNGNLS LVGNAQATFNQATLNGNTS ASDNAS FNLSNNAVQNG 
45 ||:: :::::hllllhl I 

p4 53 8 7 GNVTL TNHSQFTLSNNATQIG 

750 760 770 



840 850 860 870 880 890

orf1ng-1.pep SLTLSDNAKANVSHSALNGNVSLADKAVFHFENSRFTGKISGGKDTALHLKDSEWTLPSG






:= lllh hh- Mill IMM I = Mhh MM I h= h = = MMI 
p4 5 3 8 7 NIRLSDNSTATVDNANLNGNVHLTDSAQFSLKNSHFSHQIQGDKGTTVTLENATWTMPSD 

780 790 800 810 820 830 

900 910 920 930 940 950 

5 orf lng-1 .pep TELGNLNLDNATITLNSAYRHDAAGAQTGSAADAPRRRSRRSLLSVTPPTSAESRFNTLT 

I I IIMMMIIIIMI =M: -MMI I : I Mill MUM 

p4 5 3 8 7 TTLQNLTLNNS T I TLNSAY S AS SNNT PRRRS LETETTPTS AEHRFNTLT 

840 850 860 870 

960 970 980 990 1000 1010 

10 orf lng-1 .pep VNGKLNGQGTFRFMSELFGYRSGKLKLAESSEGTYTLAVNNTGNEPVSLEQLTWEGKDN 

IMMMIIMM I 1111 = 1 1 1 1 1 - - 1 1 I IM MIMI MIIIIMIMII 

p4 5387 VNGKLSGQGTFQFTSSLFGYKSDKLKLSNDAEGDYILSVRNTGKEPETLEQLTLVESKDN 
880 890 900 910 920 930 

1020 1030 1040 1050 1060 1070 

1 5 orf lng- 1 . pep TPLSENLNFTLQNEHVDAGAWRYQLIRKDGEFRLHNPVKEQELSDKLGKAGETEAALTAK 

|||::|:Mh|:|lllll II : I = = = I II I I I I I h I I I I I = I I = = l I II 
p4 5 3 8 7 QPLSDKLKFTLENDHVDAGALRYKLVKNDGEFRLHNP I KEQELHNDLVRAEQAERTLEAK 

940 950 960 970 980 990 

1080 1090 1100 1110 1120 1130 

20 or f lng- 1 . pep QAQLAAKQQAEKDNAQSLDALI AAGRNAT- EKAESVAEPARQAGGENAGIMQAEEEKKRV 

|:: :|| |: : =.= :| I I I :: s = = I hi I =1 :::= : |:| 
p4 5387 QVEPTAKTQTGEPKVRSRRAARAAFPDTLPDQSLLNALEAKQAE-LTAETQKSKAKTKKV 
1000 1010 1020 1030 1040 1050 

1140 1150 1160 1170 1180 1190 

25 orf lng- 1 . pep QADK- - -DTALAKQREAETRPATTAFPRARRARRD-LPQPQPQPQPQPQRDLISRYANSG 

:: : : I I = I :: : : : : : | | | : : I : I : I I I I I I : I I : 

p4 5387 RSKRAVFSDPLLDQSLFALEAALEVIDAPQQSEKDRLAQEEAEKQ-RKQKDLISRYSNSA 
1060 1070 1080 1090 1100 1110 

1200 1210 1220 1230 1240 1250 

30 orf lng-1. pep LS E FS ATLNS VFAVQDELDRVFAEDRRNAVWTSGI RDTKHYRSQDFRAYRQQ - TDLRQI G 

|||:|||:||:::|||||l|:|::: = = 1111= =1 = = l h Nihil hlllll 
P45387 LSELSATVNSMLSVQDELDRLFVDQAQSAVWTNIAQDKRRYDSDAFRAYQQQKTNLRQIG 
1120 1130 1140 1150 1160 1170 

1260 1270 1280 1290 1300 1310 

35 orf lng- 1 . pep MQKNLGSGRVGILFSHNRTGNTFDDGIGNSARLAHGAVFGQYGIGRFDIGISAGAGFSSG 

:|| I = = 1 I = I :Mhh lllh = I I h = 1 = 11 I : : : | : : : | : | = I : : 
p45387 VQKALANGRIGAVFSHSRSDNTFDEQVKNHATLTMMSGFAQYQWGDLQFGVNVGTGISAS 
1180 1190 1200 1210 1220 1230 

1320 1330 1340 1350 1360 1370 

40 orf lng-1 .pep SLSDGIRGKIRRRVLHYGIQARYRAGFGGFGIEPHIGATRYFVQKADYRYENVNIATPGL 

:::: | | : | : : : : | | : : | h =1 = I I = 1 = = I = = I I I = = = =1= hi = 11 = 1 

p45387 KMAEEQSRKIHRKAINYGVNASYQFRLGQLGIQPYFGVNRYFIERENYQSEEVRVKTPSL 
1240 1250 1260 1270 1280 1290 

1380 1390 1400 1410 j 1420 1430 

45 orf lng- 1 . pep AFNRYRAGIKADYSFKPAQHISITPYLSLSYTDAASGKVRTRVNTAVLAQDFGKTRSAEW 

Mill I I I = = I I = 1 h = = lh 11= ::|:|:::::|:| II =11 I 11= =1 
p453 87 AFNRYNAGIRVDYTFTPTDNISVKPYFFVNYVDVSNANVQTTVNLTVLQQPFGRYWQKEV 

1300 1310 1320 1330 1340 1350 

1440 1450 1460 1469 
orf1ng-1.pep GVNAEIKGFTLSLHAAAAKGPQLEAQHSAGIKLGYRWX

|::||| I H : - II h-hllllll 
p4 5 3 8 7 GLKAEILHFQ I S AF I S KSQGSQLGKQQNVGVKLGYRW 

1360 1370 1380 1390 

Based on this analysis, it is predicted that these proteins from N. meningitidis and N. gonorrhoeae, 
and their epitopes, could be useful antigens for vaccines or diagnostics, or for raising antibodies. 
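
The Smith-Waterman score quoted for the ORF1ng/hap comparison is the score of the best local alignment between the two sequences. As a rough illustration only, the Python sketch below computes such a score with a toy scoring scheme (match +2, mismatch -1, linear gap -1). The function name `smith_waterman_score` and these scoring values are assumptions made for the example; they are not the substitution matrix or gap penalties behind the figures reported above, so the numbers it prints are not comparable with them.

```python
def smith_waterman_score(a, b, match=2, mismatch=-1, gap=-1):
    """Return the best local-alignment score between sequences a and b.

    Toy scoring (assumed for illustration, not taken from the source):
    +2 per match, -1 per mismatch, -1 per gap position. Real protein
    comparisons normally use a substitution matrix and affine gaps.
    """
    best = 0
    prev = [0] * (len(b) + 1)  # previous row of the dynamic-programming matrix
    for i in range(1, len(a) + 1):
        curr = [0] * (len(b) + 1)
        for j in range(1, len(b) + 1):
            step = match if a[i - 1] == b[j - 1] else mismatch
            diag = prev[j - 1] + step   # align a[i-1] with b[j-1]
            up = prev[j] + gap          # a[i-1] aligned against a gap
            left = curr[j - 1] + gap    # b[j-1] aligned against a gap
            # Local alignment: a cell never drops below zero.
            curr[j] = max(0, diag, up, left)
            best = max(best, curr[j])
        prev = curr
    return best


if __name__ == "__main__":
    # Two short fragments copied from the ORF1ng / P45387 alignment above.
    print(smith_waterman_score("AGHTYFGINYQYYRDFAEN", "AGHTYFGIDYQYYRDFAEN"))
```

Only the score is tracked here; recovering the aligned residues themselves would additionally require a traceback through the full matrix.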



Example 78 

The following partial DNA sequence was identified in N. meningitidis [<SEQ ID 655>] (SEQ ID
NO: 655):



1 . . AAGGTGTGGC AATTTGTCGA AGA.CCGCTG CGTGCCGTCG TGCCTGCCGA 

51 CAGTTTTGAA CCGACCGCGC AAAAATTGAA CCTGTTTAAG GCGGGTGCGG 

101 CAACCATTTT GTTTTATGAA GATCAAAATG TCGTCAAAGG TTTGCAGGAG 

151 CAGTTCCCTG CTTATGCCGC TAACTTCCCC GTTTGGGCGg ATCAGGCAAA 

201 CGCGATGGTG CAGTATGCCG TTTGGACGAC ACTTGCCGCG GTCGGCGTAG 

251 GTGCAAACCT GCAACATTAC AATCCCTTGC CCGATGCGGC GATTGCCAAA
301 GCGTGGAATA TCCCCGAAAA CTGGTTGTTG CGCGCACAAA TGGTTATCGG

351 CGGTATTGAA GGGGCGGCAG GTGAAAAGAC CTTTGAACCC GTTGCAGAAC

401 GTTTGAAAGT GTTCGGCGCA TAA

This corresponds to the amino acid sequence [<SEQ ID 656; ORF6>] (SEQ ID NO: 656; ORF6):

1 ..KVWQFVEXPL RAVVPADSFE PTAQKLNLFK AGAATILFYE DQNVVKGLQE

51 QFPAYAANFP VWADQANAMV QYAVWTTLAA VGVGANLQHY NPLPDAAIAK 

101 AWNIPENWLL RAQMVIGGIE GAAGEKTFEP VAERLKVFGA * 
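
The correspondence between a DNA listing and its amino acid sequence can be checked mechanically. The sketch below is a minimal Python illustration, assuming the standard genetic code and a reading frame that starts at the first listed base; the names `CODON_TABLE` and `translate` are introduced here for the example. Codons containing an undetermined base (shown as "." in the partial sequence) are reported as X and stop codons as *, following the notation of the listings.

```python
from itertools import product

# Standard genetic code, codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa for codon, aa in zip(product(BASES, repeat=3), AMINO)
}


def translate(dna):
    """Translate a DNA string in frame 1; unknown codons become 'X'."""
    dna = "".join(dna.split()).upper()  # drop spacing, accept lowercase bases
    protein = []
    for i in range(0, len(dna) - 2, 3):
        protein.append(CODON_TABLE.get(dna[i:i + 3], "X"))
    return "".join(protein)


if __name__ == "__main__":
    # First 30 bases of the partial sequence above, including the "." base.
    print(translate("AAGGTGTGGC AATTTGTCGA AGA.CCGCTG"))
```

Applied to the first thirty bases of the partial sequence, this reproduces the opening residues of ORF6, including the X at the position of the undetermined base.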

Further sequence analysis revealed a further partial DNA sequence [<SEQ ID 657>] (SEQ ID NO:
657):



1 . . CTGCGTGCCG TCGTGCCTGC CGACAGTTTT GAACCGACCG CGCAAAAATT 

51 GAACCTGTTT AAGGCGGGTG CGGCAACCAT TTTGTTTTAT GAAGATCAAA 

101 ATGTCGTCAA AGGTTTGCAG GAGCAGTTCC CTGCTTATGC CGCTAACTTC 

151 CCCGTTTGGG CGGATCAGGC AAACGCGATG GTGCAGTATG CCGTTTGGAC 

201 GACACTTGCC GCGGTCGGCG TAGGTGCAAA CCTGCAACAT TACAATCCCT 

251 TGGCCGATGC GGCGATTGCC AAAGCGTGGA ATATCCCCGA AAACTGGTTG 

301 TTGCGCGCAC AAATGGTTAT CGGGGGTATT GAAGGGGCGG CAGGTGAAAA

351 GACCTTTGAA CCCGTTGCAG AACGTTTGAA AGTGTTCGGC GCATAA 

This corresponds to the amino acid sequence [<SEQ ID 658; ORF6-1>] (SEQ ID NO: 658; ORF6-1):

1 ..LRAVVPADSF EPTAQKLNLF KAGAATILFY EDQNVVKGLQ EQFPAYAANF

51 PVWADQANAM VQYAVWTTLA AVGVGANLQH YNPLPDAAIA KAWNIPENWL

101 LRAQMVIGGI EGAAGEKTFE PVAERLKVFG A* 



Computer analysis of this amino acid sequence gave the following results: 
Homology with a predicted ORF from N. meningitidis (strain A)

ORF6 (SEQ ID NO: 656) shows 98.6% identity over a 140 aa overlap with an ORF (ORF6a) (SEQ
ID NO: 660) from strain A of N. meningitidis:

10 20 30 

orf6 pep KVWQFVEXPLRAWPADSFEPTAQKLNLFK 

lllllll I I I I I I I I II I i I I I II I I I 
orf6a QIVEHAVLHTPSSFNSQSARVWLFGEEHDKVWQFVEDALRAVVPADSFEPTAQKLNLFK 

40 50 60 70 80 90 

40 50 60 70 80 90 

or f 6 . pep AGAATILFYEDQNWKGLQEQFPAYAANFPVWADQANAMVQYAVWTTLAAVGVGANLQHY 

I llllllllll IIIIIMIIIII IMIIIIIMIMMIIIIIMIII llllllll 

orf6a AGAATILFYEDQNVVKGLQEQFPAYAANFPVWADQANAMVQYAVWTTLAAVGVGANLQHY 
100 110 120 130 140 150 

100 110 120 130 140 

orf 6 . pep NPLPDAAIAKAWNIPENWLLRAQMVIGGIEGAAGEKTFEPVAERLKVFGAX 

1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 M 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 

orf 6a N PL PDAA I AKAWN I P ENWLLRAQM V I GG I EGAAGE KT FE P VAERL KVFGAX 

160 170 180 190 200 

The complete length ORF6a nucleotide sequence [<SEQ ID 659>] (SEQ ID NO: 659) is:



1 ATGACCCGTC AATCTCTGCA ACAGGCTGCC GAAAGCCGCC GTTCCATTTA 

51 TTCGTTAAAT AAAAATCTGC CCGTCGGCAA AGATGAAATC GTCCAAATCG 

101 TCGAACACGC CGTTTTGCAC ACACCTTCTT CGTTCAATTC CCAATCTGCC 

151 CGTGTGGTCG TGCTGTTTGG CGAAGAGCAT GATAAGGTGT GGCAATTTGT 

201 CGAAGACGCG CTGCGTGCCG TCGTGCCTGC CGACAGTTTT GAACCGACCG 

251 CGCAAAAATT GAACCTGTTT AAGGCGGGTG CGGCAACTAT TTTGTTTTAT 

3 01 GAAGATCAAA ATGTCGTCAA AGGTTTGCAG GAGCAGTTCC CTGCTTATGC 
351 CGCCAACTTT CCCGTTTGGG CGGACCAGGC GAACGCGATG GTGCAGTATG 

4 01 CCGTTTGGAC GACACTTGCC GCGGTCGGCG TAGGTGCAAA CCTGCAACAT 
4 51 TACAATCCCT TGCCCGATGC GGCGATTGCC AAAGCGTGGA ATATCCCCGA 
501 AAACTGGTTG TTGCGCGCAC AAATGGTTAT CGGCGGTATT GAAGGGGCGG 
551 CAGGTGAAAA GACCTTTGAA CCAGTTGCAG AACGTTTGAA AGTGTTCGGC 
601 GCATAA 
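
The listings in this document give a start coordinate followed by groups of ten residues. As a small Python sketch of how such a listing might be joined into one contiguous string before further analysis, the hypothetical helper `parse_listing` below drops the leading coordinate on each row and concatenates the groups. It assumes a clean listing in which the coordinate is a single number at the start of the row.

```python
def parse_listing(text):
    """Join a coordinate-numbered sequence listing into one string."""
    residues = []
    for line in text.splitlines():
        tokens = line.split()
        # Drop the leading coordinate (e.g. "551") if the row has one.
        if tokens and tokens[0].isdigit():
            tokens = tokens[1:]
        residues.extend(tokens)
    return "".join(residues)


if __name__ == "__main__":
    # Last two rows of the ORF6a nucleotide listing above.
    rows = (
        "551 CAGGTGAAAA GACCTTTGAA CCAGTTGCAG AACGTTTGAA AGTGTTCGGC\n"
        "601 GCATAA"
    )
    tail = parse_listing(rows)
    print(len(tail), tail[-6:])
```

The final row of the ORF6a listing starts at coordinate 601 and holds six bases, so the listing is 606 bases long; 606 / 3 = 202 codons, which is consistent with a 201-residue protein followed by a stop codon.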

This is predicted to encode a protein having amino acid sequence [<SEQ ID 660>] (SEQ ID NO:
660):

1 MTRQSLQQAA ESRRSIYSLN KNLPVGKDEI VQIVEHAVLH TPSSFNSQSA
51 RVVVLFGEEH DKVWQFVEDA LRAVVPADSF EPTAQKLNLF KAGAATILFY
101 EDQNVVKGLQ EQFPAYAANF PVWADQANAM VQYAVWTTLA AVGVGANLQH
151 YNPLPDAAIA KAWNIPENWL LRAQMVIGGI EGAAGEKTFE PVAERLKVFG
201 A*



ORF6a (SEQ ID NO: 660) and ORF6-1 (SEQ ID NO: 658) show 100.0% identity in 131 aa
overlap:

50 60 70 80 90 100

orf 6a . pep TPSSFNSQSARVWLFGEEHDKVWQFVEDALRAWPADSFEPTAQKLNLFKAGAATILFY 

I I I I I I I ' I I I I I I I I I I I I I I I I I II I 
orf 6-1 LRAWPADS FEPTAQKLNLFKAGAAT I LFY 

5 10 20 30 

110 120 130 140 150 160 

orf 6a . pep EDQNWKGLQEQFPAYAANFPWADQANAIWQYAVWTTLAAVGVGANLQHYNPLPDAAIA 

- I I I I I I I I I I II I I I I I I I I I I I I I I I I I I M I I I I I I I II ; I I I I I II M I I I I I I I 

or f 6 - 1 EDQNWKGLQEQFPAYAANFPVWADQANAMVQYAVWTTLAAVGVGANLQHYNPLPDAAI A 

10 40 50 60 70 80 90 

170 180 190 200 

orf 6a . pep KAWNIPENWLLRAQMVIGGIEGAAGEKTFEPVAERLKVFGAX 

I I I I I I II I I I I I I I I I I I I I I I I I I I I I I I I I I I I ! I M 
orf 6- 1 KAWNIPENWLLRAQMVIGGIEGAAGEKTFEPVAERLKVFGAX 
15 100 110 120 130 

Homology with a predicted ORF from N. gonorrhoeae 

ORF6 (SEQ ID NO: 656) shows 95.7% identity over a 140 aa overlap with a predicted ORF
(ORF6ng) (SEQ ID NO: 662) from N. gonorrhoeae:

orf 6 .pep KVWQFVEXPLRAWPADSFEPTAQKLNLFK 3 0 

20 | | | | | | | | | | | | | | | M | | | | | | | : | | | 

orf6ng SNVSLDMSNPTVLRMGLPLY I AS LRRGA I YKVWQFVEDALRA WP ADS FEPTAQKLKLFK 64 

orf 6 .pep AGAATILFYEDQNWKGLQEQFPAYAANFPVWADQANAMVQYAVWTTLAAVGVGANLQHY 90 

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I ' I I I I I I I I . I I I I I I I h I I I I I I 
orf 6ng AGAATILFYEDQNWKGLQEQFPAYAANFPVWADQANAMVQYAVWTTLAAVGAGANLQHY 124 

25 orf 6 .pep NPLPDAAIAKAWNIPENWLLRAQMVIGGIEGAAGEKTFEPVAERLKVFGA 140 

I I I I :| I I I I I I I I I I I I I I I I I I I I I I I I I I h I i I I II I I I I I I I 
orf 6ng NPLPDVAIAKAWNIPENWLLRAQMVIGGIEGAAGEKVFEPVAERLKVFGA 174 
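
The percent identity figures quoted for these alignments are the fraction of aligned columns in which both sequences carry the same residue. A minimal Python sketch of that calculation follows; the function name `percent_identity` and the convention that gap columns count toward the overlap are assumptions for illustration, since the source does not state how the alignment program treats gaps.

```python
def percent_identity(aligned_a, aligned_b, gap="-"):
    """Percent of aligned columns holding identical residues.

    Both inputs must already be aligned (equal length, gaps as '-').
    Assumption: a column with a gap on one side counts toward the
    overlap but never as an identity.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    overlap = 0
    identical = 0
    for x, y in zip(aligned_a, aligned_b):
        if x == gap and y == gap:
            continue  # ignore columns that are empty on both sides
        overlap += 1
        if x == y and x != gap:
            identical += 1
    if overlap == 0:
        raise ValueError("no aligned columns")
    return 100.0 * identical / overlap


if __name__ == "__main__":
    # Last 50 columns of the ORF6 / ORF6ng alignment above.
    orf6 = "NPLPDAAIAKAWNIPENWLLRAQMVIGGIEGAAGEKTFEPVAERLKVFGA"
    orf6ng = "NPLPDVAIAKAWNIPENWLLRAQMVIGGIEGAAGEKVFEPVAERLKVFGA"
    print(round(percent_identity(orf6, orf6ng), 1))
```

On the same arithmetic, 134 identical columns out of 140 would give 95.7% and 138 out of 140 would give 98.6%.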

The complete length ORF6ng nucleotide sequence [<SEQ ID 661>] (SEQ ID NO: 661) was
identified as:

1 ATGGCCGTTG CGTCAAATGT CAGCTTGGAT ATGTCCAATC CTACGGTGTT
51 ACGCATGGGA TTACCCTTAT ATATTGCGTC CCTAAGAAGG GGCGCAATAT
101 ATAAGGTGTG GCAATTTGTC GAAGACGCGC TGCGTGCCGT CGTGCCTGCC
151 GACAGTTTTG AACCGACCGC GCAAAAATTG AAGCTGTTTA AGGCGGGCGC
201 GGCAACCATT TTGTTTTATG AAGATCAAAA TGTCGTCAAA GGTTTGCAGG
251 AGCAGTTCCC TGCTTATGCC GCCAACTTTC CCGTTTGGGC GGACCAGGCG
301 AACGCTATGG TACAGTATGC CGTCTGGACG ACACTTGCCG CGGTCGGTGC
351 AGGTGCAAAT CTGCAACATT ACAACCCCTT GCCCGATGTG GCGATTGCTA
401 AAGCGTGGAA TATTCCCGAA AACTGGCTGT TGCGCGCGCA AATGGTTATC
451 GGTGGTATTG AAGGGGcggc aggtgaaaaa gtctttgaac CCGTTGCgga
501 acgtttgAAA GTGTTCGGCG CATAA

This encodes a protein having amino acid sequence [<SEQ ID 662>] (SEQ ID NO: 662):

1 MAVASNVSLD MSNPTVLRMG LPLYIASLRR GAIYKVWQFV EDALRAVVPA
51 DSFEPTAQKL KLFKAGAATI LFYEDQNVVK GLQEQFPAYA ANFPVWADQA
101 NAMVQYAVWT TLAAVGAGAN LQHYNPLPDV AIAKAWNIPE NWLLRAQMVI
151 GGIEGAAGEK VFEPVAERLK VFGA*



ORF6ng (SEQ ID NO: 662) and ORF6-1 (SEQ ID NO: 658) show 96.9% identity in 131 aa
overlap:



10 20 30

orf6-1.pep LRAVVPADSFEPTAQKLNLFKAGAATILFY

1 1 II 1 1 1 M 1 1 M 1 1 M 1 1 : 1 I I I M M

orf6ng PTVLRMGLPLYIASLRRGAIYKVWQFVEDALRAVVPADSFEPTAQKLKLFKAGAATILFY
20 30 40 50 60 70

40 50 60 70 80 90

orf6-1.pep EDQNVVKGLQEQFPAYAANFPVWADQANAMVQYAVWTTLAAVGVGANLQHYNPLPDAAIA

1 1 Ml 1 1 1 1 1 1 1 1 M I M 1 1 1 1 1 1 1 1 1 1 M 1 1 1 1 M 1 1 1 1 hi I M 1 1 1 1 1 II U 1 1

orf6ng EDQNVVKGLQEQFPAYAANFPVWADQANAMVQYAVWTTLAAVGAGANLQHYNPLPDVAIA
80 90 100 110 120 130

100 110 120 130

orf6-1.pep KAWNIPENWLLRAQMVIGGIEGAAGEKTFEPVAERLKVFGAX

I I I I I I I I I I I I I I I I I I I I I I I I : M I I I I I I I I I I I I I
orf6ng KAWNIPENWLLRAQMVIGGIEGAAGEKVFEPVAERLKVFGAX

140 150 160 170

It is predicted that the proteins from N. meningitidis and N. gonorrhoeae, and their epitopes, could
be useful antigens for vaccines or diagnostics, or for raising antibodies.



Example 79 

The following partial DNA sequence was identified in N. meningitidis [<SEQ ID 663>] (SEQ ID
NO: 663):



1 . . GGCTACAACT ACCTGTTCGC GCGCGGCAGC CGCATCGCCA ACTACCAAAT 

51 CAACGGCATC CCCGTTGCCG ACGCGCTGGC CGATACGGGt CAATGCCAAC 

101 ACCGCCGCCT ATGAGCGCGT AGAAGTCGTG CGCGGCGTGG CGGGGCTGCT 

151 GGACGGCACG GGCGAGCCTT CCGCCACCGT CAATCTGGTG CGCAAACGCC 

201 TGACCCGCAA GCCATTGTTT GAAGTCCGCG CCGAAGCgGG CAACCGcAAA
251 CATTTCGGGC TGGACGCGGA CGTATCGGGC AGCCTGAACA CCGAAG.crC

301 rCTGCGCgGC CGCCTGGTTT CCAcCTTCGG ACGCGGCGAC TCGTGGCGGC

351 GGCGCGAACG CAGCCGskAT GCCGAACTCT ACGGCATTTT GGAATACGAC

401 ATCGCACCGC AAACCCGCGT CCACGCArGC ATGGACTACC AGCAGGCGAA
451 AGAAACCGCC GACGCGCCGC TCAGcTACGC CGTGTACGAC AGCCAAGGTT
501 ATGCCACCGC CTTCGGCCCG AAAGACAACC CCGCCACAAA TTGGGCGAAC
551 AGCCACCACC GTGCGCTCAA CCTGTTCGCC GGCATCGAAC ACCGCTTCAA
601 CCAAGACTGG AAACTCAAAG CCGAATACGA CTAC..



This corresponds to the amino acid sequence [<SEQ ID 664; ORF23>] (SEQ ID NO: 664;
ORF23):
1 ..GYNYLFARGS RIANYQINGI PVADALADTG NANTAAYERV EVVRGVAGLL
51 DGTGEPSATV NLVRKRLTRK PLFEVRAEAG NRKHFGLDAD VSGSLNTEXX
101 LRGRLVSTFG RGDSWRRRER SRXAELYGIL EYDIAPQTRV HAXMDYQQAK
151 ETADAPLSYA VYDSQGYATA FGPKDNPATN WANSHHRALN LFAGIEHRFN
201 QDWKLKAEYD Y..



Further work revealed the complete nucleotide sequence [<SEQ ID 665>] (SEQ ID NO: 665):



1 ATGACACGCT TCAAATATTC CCTGCTGTTT GCCGCCCTGT TGCCCGTGTA 

51 CGCGCAGGCC GATGTTTCTG TTTCAGACGA CCCCAAACCG CAGGAAAGCA 

101 CTGAATTGCC GACCATCACC GTTACCGCCG ACCGCACCGC GAGTTCCAAC 

151 GACGGCTACA CTGTTTCCGG CACGCACACC CCGCTCGGGC TGCCCATGAC 

201 CCTGCGCGAA ATCCCGCAGA GCGTCAGCGT CATCACATCG CAACAAATGC 

251 GCGACCAAAA CATCAAAACG CTCGACCGCG CCCTGTTGCA GGCGACCGGC 

301 ACCAGCCGCC AGATTTACGG CTCCGACCGC GCGGGCTACA ACTACCTGTT 

351 CGCGCGCGGC AGCCGCATCG CCAACTACCA AATCAACGGC ATCCCCGTTG 

401 CCGACGCGCT GGCCGATACG GGCAATGCCA ACACCGCCGC CTATGAGCGC 

451 GTAGAAGTCG TGCGCGGCGT GGCGGGGCTG CTGGACGGCA CGGGCGAGCC 

501 TTCCGCCACC GTCAATCTGG TGCGCAAACG CCTGACCCGC AAGCCATTGT 

551 TTGAAGTCCG CGCCGAAGCG GGCAACCGCA AACATTTCGG GCTGGACGCG 

601 GACGTATCGG GCAGCCTGAA CACCGAAGGC ACGCTGCGCG GCCGCCTGGT 

651 TTCCACCTTC GGACGCGGCG ACTCGTGGCG GCGGCGCGAA CGCAGCCGCG 

701 ATGCCGAACT CTACGGCATT TTGGAATACG ACATCGCACC GCAAACCCGC 

751 GTCCACGCAG GCATGGACTA CCAGCAGGCG AAAGAAACCG CCGACGCGCC 

801 GCTCAGCTAC GCCGTGTACG ACAGCCAAGG TTATGCCACC GCCTTCGGCC 

851 CGAAAGACAA CCCCGCCACA AATTGGGCGA ACAGCCGCCA CCGTGCGCTC 

901 AACCTGTTCG CCGGCATCGA ACACCGCTTC AACCAAGACT GGAAACTCAA 

951 AGCCGAATAC GACTACACCC GCAGCCGCTT CCGCCAGCCC TACGGCGTAG 

1001 CAGGCGTGCT TTCCATCGAC CACAACACCG CCGCCACCGA CCTGATTCCC 

1051 GGTTATTGGC ACGCCGACCC GCGCACCCAC AGCGCCAGCG TGTCATTGAT 

1101 CGGCAAATAC CGCCTGTTCG GCCGCGAACA CGATTTAATC GCGGGTATCA 

1151 ACGGTTACAA ATACGCCAGC AACAAATACG GCGAACGCAG CATCATCCCC

1201 AACGCCATTC CCAACGCCTA CGAATTTTCC CGCACGGGTG CCTACCCGCA

1251 GCCTGCATCG TTTGCCCAAA CCATCCCGCA ATACGGCACC AGGCGGCAAA

1301 TCGGCGGCTA TCTCGCCACC CGTTTCCGCG CCGCCGACAA CCTTTCGCTG
1351 ATTTTGGGCG GACGATACAC CCGTTACCGC ACCGGCAGCT ACGACAGCCG

1401 CACACAAGGC ATGACCTATG TGTCCGCCAA CCGTTTCACC CCCTACACAG
1451 GCATCGTGTT CGACCTGACC GGCAACCTGT CTCTTTACGG CTCGTACAGC
1501 AGCCTGTTCG TCCCGCAATC GCAAAAAGAC GAACACGGCA GCTACCTGAA 
1551 ACCCGTAACC GGCAACAATC TGGAAGCCGG CATCAAAGGC GAATGGCTTG 
1601 AAGGCCGTCT GAACGCATCC GCCGCCGTGT ACCGCGCCCG TAAAAACAAC 
1651 CTCGCCACCG CAGCAGGACG CGACCCGAGC GGCAACACCT ACTACCGCGC 
1701 CGCCAACCAA GCCAAAACCC ACGGCTGGGA AATCGAAGTC GGCGGCCGCA 
1751 TCACGCCCGA ATGGCAGATA CAGGCAGGTT ACAGCCAAAG CAAAACCCGC 
1801 GACCAAGACG GCAGCCGCCT GAACCCCGAC AGCGTACCCG AACGCAGCTT 
1851 CAAACTCTTC ACTGCCTACC ACTTTGCCCC CGAAGCCCCC AGCGGCTGGA 
1901 CCATCGGCGC AGGCGTGCGC TGGCAGAGCG AAACCCACAC CGACCCTGCC 
1951 ACGCTCCGCA TCCCCAACCC CGCCGCCAAA GCCCGCGCCG CCGACAACAG 
2001 CCGCCAAAAA GCCTACGCCG TCGCCGACAT CATGGCGCGT TACCGCTTCA
2051 ATCCGCGCGC CGAACTGTCG CTGAACGTGG ACAATCTGTT CAACAAACAC
2101 TACCGCACCC AGCCCGACCG CCACAGCTAC GGCGCACTGC GGACAGTGAA 
2151 CGCGGCGTTT ACCTATCGGT TTAAATAA 

This corresponds to the amino acid sequence [<SEQ ID 666; ORF23-1>] (SEQ ID NO: 666;
ORF23-1):

1 MTRFKYSLLF AALLPVYAQA DVSVSDDPKP QESTELPTIT VTADRTASSN
51 DGYTVSGTHT PLGLPMTLRE IPQSVSVITS QQMRDQNIKT LDRALLQATG
101 TSRQIYGSDR AGYNYLFARG SRIANYQING IPVADALADT GNANTAAYER
151 VEVVRGVAGL LDGTGEPSAT VNLVRKRLTR KPLFEVRAEA GNRKHFGLDA
201 DVSGSLNTEG TLRGRLVSTF GRGDSWRRRE RSRDAELYGI LEYDIAPQTR
251 VHAGMDYQQA KETADAPLSY AVYDSQGYAT AFGPKDNPAT NWANSRHRAL
301 NLFAGIEHRF NQDWKLKAEY DYTRSRFRQP YGVAGVLSID HNTAATDLIP
351 GYWHADPRTH SASVSLIGKY RLFGREHDLI AGINGYKYAS NKYGERSIIP
401 NAIPNAYEFS RTGAYPQPAS FAQTIPQYGT RRQIGGYLAT RFRAADNLSL
451 ILGGRYTRYR TGSYDSRTQG MTYVSANRFT PYTGIVFDLT GNLSLYGSYS
501 SLFVPQSQKD EHGSYLKPVT GNNLEAGIKG EWLEGRLNAS AAVYRARKNN
551 LATAAGRDPS GNTYYRAANQ AKTHGWEIEV GGRITPEWQI QAGYSQSKTR
601 DQDGSRLNPD SVPERSFKLF TAYHFAPEAP SGWTIGAGVR WQSETHTDPA
651 TLRIPNPAAK ARAADNSRQK AYAVADIMAR YRFNPRAELS LNVDNLFNKH
701 YRTQPDRHSY GALRTVNAAF TYRFK*

Computer analysis of this amino acid sequence gave the following results:



Homology with the ferric-pseudobactin receptor PupB of Pseudomonas putida (accession number
P38047) (SEQ ID NO: 1154)

ORF23 (SEQ ID NO: 664) and PupB protein (SEQ ID NO: 1154) show 32% aa identity in 205 aa
overlap:



Orf23   6 FARGSRIANYQINGIPVADALADTGNANTAAYERVEVVRGVAGLLDGTGEPSATVNLVRK  65
          ++RG  I NY+++G+P +  L D  + + A ++RVE+VRG  GL+ G G PSAT+NL+RK
PupB  215 WSRGFAIQNYEVDGVPTSTRL-DNYSQSMAMFDRVEIVRGATGLISGMGNPSATINLIRK 273

Orf23  66 RLTRKPLFEVRAEAGNRKHFGLDADVSGSLNTEXXLRGRLVSTFXXXXXXXXXXXXXXAE 125
          R T +    +  EAGN   +G   DVSG L     +RGR V+ +
PupB  274 RPTAEAQASITGEAGNWDRYGTGFDVSGPLTETGNIRGRFVADYKTEKAWIDRYNQQSQL 333

Orf23 126 LYGILEYDIAPQTRVHAXMDYQQAKETADAPLSYAVYD--SQGYATAFGPKDNPATNWAN 183
          +YGI E+D++  T +     Y   +   D+PL   +    S G  T      N A +W+
PupB  334 MYGITEFDLSEDTLLTVGFSY--LRSDIDSPLRSGLPTRFSTGERTNLKRSLNAAPDWSY 391

Orf23 184 SHHRALNLFAGIEHRFNQDWKLKAE 208
          + H   + F  IE +    W  K E
PupB  392 NDHEQTSFFTSIEQQLGNGWSGKIE 416





Homology with a predicted ORF from N. meningitidis (strain A)

ORF23 (SEQ ID NO: 664) shows 95.7% identity over a 211 aa overlap with an ORF (ORF23a)
(SEQ ID NO: 668) from strain A of N. meningitidis:



35 10 20 30 

orf23 pep GYNYLFARGSRI ANYQINGI PVADALADTG 

I I M I I I I I I I I I I M I II I I I M I I I I 
orf23a. QMRDQN I KALDRALLQATGTSRQ I YGS DRAG YNYLFARGSR I ANYQINGI PVADALADTG 

90 100 110 120 130 140 






40 50 60 70 80 90 

orf23 .pep NANTAAYERVEWRGVAGLLDGTGEPSATVNLVRKRLTRKPLFEVRAEAGNRKHFGLDAD 
ill! 1 1 M 1 1 M 1 1 1 1 1 1 1 1 II Ill III I Ml II II MM Ml I II

or f 2 3 a ' NANTAAYERVEWRGVAGLLDGTGEPS ATVNLVRKRPTRKPLFEVRAEAGNRKHFGLGAD 

150 160 170 180 190 200 

100 110 120 130 140 150 

orf 23 . pep VSGS LNTEXXLRGRLVSTFGRGDSWRRRERSRXAELYG I LE YD I APQTRVHAXMDYQQAK 

IIMIhl : M i M M 1 1 1 1 1 1 1 M : ! I i I IIIIIIIIIIIIIIIIMI lllllll 

orf 23a VSGSLNAEGTLRGRLVSTFGRGDSWRQRERSRDAELYGILEYDIAPQTRVHAGMDYQQAK 
210 220 230 240 250 260 

160 170 180 190 200 210 

orf 23 . pep ETADAPLSYAVYDSQGYATAFGPKDNPATNWANSHHRALNLFAGIEHRFNQDWKLKAEYD 

Mill MMMIIM MIIIMMMMIMMMIMM MMIMIMII III 

orf 23a ETADAPLSYAVYDSQGYATAFGPKDNPATNWANSRHRALNLFAGIEHRFNQDWKLKAEYD 
270 280, 290 300 310 320 



orf 23 .pep 
orf 23a 



YTRSRFRQPYGVAGVLS IDHNTAATDLI PGYWHADPRTHSASVSLIGKYRLFGREHDLI A 
330 340 350 360 370. 380 



The complete length ORF23a nucleotide sequence [<SEQ ID 667>] (SEQ ID NO: 667) is:



1 ATGACACGCT TCAAATATTC CCTGCTGTTT GCCGCCCTGT TGCCCGTGTA 

51 CGCGCAGGCC GATGTTTCTG TTTCAGACGA CCCAAAACCG CAGGAAAGCA 

101 CTGAATTGCC GACCATCACC GTTACCGCCG ACCGCACCGC GAGTTCCAAC 

151 GACGGCTACA CTGTTTCCGG CACGCACACC CCGCTCGGGC TGCCCATGAC 

201 CCTGCGCGAA ATCCCGCAGA GCGTCAGCGT CATCACATCG CAACAAATGC 

251 GCGACCAAAA CATCAAAGCG CTCGACCGCG CCCTGTTGCA GGCGACCGGC 

301 ACCAGCCGCC AGATTTACGG CTCCGACCGC GCGGGCTACA ACTACCTGTT 

351 CGCGCGCGGC AGCCGCATCG CCAACTACCA AATCAACGGC ATCCCCGTTG

401 CCGACGCGCT GGCCGATACG GGCAATGCCA ACACCGCCGC CTATGAGCGC
451 GTAGAAGTCG TGCGCGGCGT GGCGGGGCTG CTGGACGGCA CGGGCGAGCC
501 TTCCGCCACC GTCAATCTGG TGCGCAAACG CCCGACCCGC AAGCCATTGT 
551 TTGAAGTCCG CGCCGAAGCG GGCAACCGCA AACATTTCGG GCTGGGCGCG 
601 GACGTATCGG GCAGCCTGAA TGCCGAAGGC ACGCTGCGCG GCCGCCTGGT 
651 TTCCACCTTC GGACGCGGCG ACTCGTGGCG GCAGCGCGAA CGCAGCCGCG 
701 ATGCCGAACT CTACGGCATT TTGGAATACG ACATCGCACC GCAAACCCGC 
751 GTCCACGCAG GCATGGACTA CCAGCAGGCG AAAGAAACCG CCGACGCGCC 
801 GCTCAGCTAC GCCGTGTACG ACAGCCAAGG TTATGCCACC GCCTTCGGCC 
851 CGAAAGACAA CCCCGCCACA AATTGGGCGA ACAGCCGCCA CCGTGCGCTC 
901 AACCTGTTCG CCGGCATCGA ACACCGCTTC AACCAAGACT GGAAACTCAA 
951 AGCCGAATAC GACTACACCC GCAGCCGCTT CCGCCAGCCC TACGGCGTAG 

1001 CAGGCGTGCT TTCCATCGAC CACAACACCG CCGCCACCGA CCTGATTCCC 

1051 GGTTATTGGC ACGCCGACCC GCGCACCCAC AGCGCCAGCG TGTCATTAAT 

1101 CGGCAAATAC CGCCTGTTCG GCCGCGAACA CGATTTAATC GCGGGTATCA 

1151 ACGGTTACAA ATACGCCAGC AACAAATACG GCGAACGCAG CATCATCCCC 

12 01 AACGCCATTC CCAACGCCTA CGAATTTTCC CGCACGGGTG CCTACCCGCA 

12 51 GCCTGCATCG TTTGCCCAAA CCATCCCGCA ATACGGCACC AGGCGGCAAA 

1301 TCGGCGGCTA TCTCGCCACC CGTTTCCGCG CCGCCGACAA CCTTTCGCTG 

1351 ATACTCGGCG GCAGATACAG CCGTTACCGC ACCGGCAGCT ACGACAGCCG 

14 01 CACACAAGGC ATGACCTATG TGTCCGCCAA CCGTTTCACC CCCTACACAG 

14 51 GCATCGTGTT CGACCTGACC GGCAACCTGT CGCTTTACGG CTCGTACAGC 

1501 AGCCTGTTCG TCCCGCAATC GCAAAAAGAC GAACACGGCA GCTACCTGAA 

1551 ACCCGTAACC GGCAACAATC TGGAAGCCGG CATCAAAGGC GAATGGCTTG 

16 01 AAGGCCGTCT GAACGCATCC GCCGCCGTGT ACCGCGCCCG TAAAAACAAC 

1651 CTCGCCACCG CAGCAGGACG CGACCCGAGC GGCAACACCT ACTACCGCGC 

1701 CGCCAACCAA GCCAAAACCC ACGGCTGGGA AATCGAAGTC GGCGGCCGCA 
1751 TCACGCCCGA ATGGCAGATA CAGGCAGGTT ACAGCCAAAG CAAAACCCGC

1801 GACCAAGACG GCAGCCGCCT GAACCCCGAC AGCGTACCCG AACGCAGCTT 

1851 CAAACTCTTC ACTGCCTACC ACTTTGCCCC CGAAGCCCCC AGCGGCTGGA 

1901 CCATCGGCGC AGGCGTGCGC TGGCAGAGCG AAACCCACAC CGACCCTGCC 

1951 ACGCTCCGCA TCCCCAACCC CGCCGCCAAA GCCCGCGCCG CCGACAACAG

2001 CCGCCAAAAA GCCTACGCCG TCGCCGACAT CATGGCGCGT TACCGCTTCA 

2051 ATCCGCGCGC CGAACTGTCG CTGAACGTGG ACAATCTGTT CAACAAACAC 

2101 TACCGCACCC AGCCCGACCG CCACAGCTAC GGCGCACTGC GGACAGTGAA 

2151 CGCGGCGTTT ACCTATCGGT TTAAATAA 

This encodes a protein having amino acid sequence [<SEQ ID 668>] (SEQ ID NO: 668):

1 MTRFKYSLLF AALLPVYAQA DVSVSDDPKP QESTELPTIT VTADRTASSN
51 DGYTVSGTHT PLGLPMTLRE IPQSVSVITS QQMRDQNIKA LDRALLQATG
101 TSRQIYGSDR AGYNYLFARG SRIANYQING IPVADALADT GNANTAAYER
151 VEVVRGVAGL LDGTGEPSAT VNLVRKRPTR KPLFEVRAEA GNRKHFGLGA
201 DVSGSLNAEG TLRGRLVSTF GRGDSWRQRE RSRDAELYGI LEYDIAPQTR
251 VHAGMDYQQA KETADAPLSY AVYDSQGYAT AFGPKDNPAT NWANSRHRAL
301 NLFAGIEHRF NQDWKLKAEY DYTRSRFRQP YGVAGVLSID HNTAATDLIP
351 GYWHADPRTH SASVSLIGKY RLFGREHDLI AGINGYKYAS NKYGERSIIP
401 NAIPNAYEFS RTGAYPQPAS FAQTIPQYGT RRQIGGYLAT RFRAADNLSL
451 ILGGRYSRYR TGSYDSRTQG MTYVSANRFT PYTGIVFDLT GNLSLYGSYS
501 SLFVPQSQKD EHGSYLKPVT GNNLEAGIKG EWLEGRLNAS AAVYRARKNN
551 LATAAGRDPS GNTYYRAANQ AKTHGWEIEV GGRITPEWQI QAGYSQSKTR
601 DQDGSRLNPD SVPERSFKLF TAYHFAPEAP SGWTIGAGVR WQSETHTDPA
651 TLRIPNPAAK ARAADNSRQK AYAVADIMAR YRFNPRAELS LNVDNLFNKH
701 YRTQPDRHSY GALRTVNAAF TYRFK*

ORF23a (SEQ ID NO: 668) and ORF23-1 (SEQ ID NO: 666) show 99.2% identity in 725 aa
overlap:






10 20 30 40 50 60 

orf 23a . pep MTRFKYSLLFAALLPVYAQADVSVSDDPKPQESTELPTITVTADRTASSNDGYTVSGTHT 

! I I I I I I I I I I I I I i I I I I I II I I I I I II I I I I I I I II I I I I I I I I I I I I I I I I I I I I I I 
orf 23 - 1 MTRFKYSLLFAALLPVYAQADVSVSDDPKPQESTELPTITVTADRTASSNDGYTVSGTHT 

10 20 30 40 50 60 






70 80 90 100 110 120 

orf 23a . pep PLGLPMTLREIPQSVSVITSQQMRDQNIKALDRALLQATGTSRQIYGSDRAGYNYLFARG 

I II 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 M 1 1 M M I I M I U I II ! 1 1 1 II 1 1 1 1 1 1 

orf 23 - 1 PLGLPMTLREIPQSVSVITSQQMRDQNIKTLDRALLQATGTSRQI YGSDRAGYNYLFARG 

70 80 90 100 110 120 






130 140 150 160 170 180 

orf 23a . pep SRIANYQING I PVADALADTGNANTAAYERVEWRGVAGLLDGTGE PS ATVNLVRKRPTR 

I I I I I I I I I I I I I I I II I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I II 
orf 23 - 1 SR I ANYQ ING I PVADALADTGNANTAAYERVE WRGVAGLLDGTGE PS ATVNLVRKRLTR 

130 140 150 160 170 180 






190 200 210 220 230 240 

orf 23a . pep KPLFEVRAEAGNRKHFGLGADVSGSLNAEGTLRGRLVSTFGRGDSWRQRERSRDAELYGI 

Illlllllllllllllll llllll UIIMIIMMIIIIIIih IIIIMMIII 

orf 23 - 1 KPLFEVRAEAGNRKHFGLDADVSGSLNTEGTLRGRLVSTFGRGDSWRRRERSRDAELYGI 

190 200 210 220 230 240 



250 260 270 280 290 300

orf23a.pep  LEYDIAPQTRVHAGMDYQQAKETADAPLSYAVYDSQGYATAFGPKDNPATNWANSRHRAL
            ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
orf23-1     LEYDIAPQTRVHAGMDYQQAKETADAPLSYAVYDSQGYATAFGPKDNPATNWANSRHRAL
250 260 270 280 290 300



310 320 330 340 350 360 

orf 23a . pep NLFAGIEHRFNQDWKLKAEYDYTRSRFRQPYGVAGVLSIDHNTAATDLIPGYWHADPRTH 

II II MM I II Ml 1 1 Ml IMMIMI II IIIIIIIMI III IMIMI I IIMI 

orf 23 - 1 NLFAGIEHRFNQDWKLKAEYDYTRSRFRQPYGVAGVLSIDHNTAATDLIPGYWHADPRTH 

310 320 330 340 350 360 






370 380 390 400 410 420 

orf 23a . pep SASVSLIGKYRLFGREHDLIAGINGYKYASNKYGERS I I PNAI PNAYEFSRTGAYPQPAS 

MMMMIMM Ml IMMMMMMMMMMIMMMMM IMIIIM 

orf 23-1 SASVSL I GKYRLFGREHDL I AG I NGYKYASNKYGERS I I PNAI PNAYEFSRTGAYPQPAS 

370 380 390 400 410 420 






430 440 450 460 470 480 

orf 23a . pep FAQTIPQYGTRRQIGGYLATRFRAADNLSLILGGRYSRYRTGSYDSRTQGMTYVSANRFT 

IMMMMIM MMMIMMMMMMMMMMMMMIMM I Mill 

orf 23 - 1 FAQT I PQYGTRRQ I GGYLATRFRAADNLSL I LGGRYTRYRTGS YDSRTQGMTYVS ANRFT 

430 440 450 460 470 480 






490 500 510 520 530 540 

orf 23a . pep PYTGIVFDLTGNLSLYGSYSSLFVPQSQKDEHGSYLKPVTGNNLEAGIKGEWLEGRLNAS 

I IM 1 1 1 1 1 1 II II M I M II i 1 1 1 1 1 M M 1 1 1 1 II M II 1 1 1 1 M 1 1 1 M I II I 

orf 23 - 1 PYTGIVFDLTGNLSLYGSYSSLFVPQSQKDEHGSYLKPVTGNNLEAGIKGEWLEGRLNAS 

490 500 510 ' ' 520 530 540 






550 560 570 580 590 600 

orf 23a . pep AAVYRARKNNLATAAGRDPSGNTYYRAANQAKTHGWEIEVGGRITPEWQIQAGYSQSKTR 

MMMIMMMMMMI MMMMMIMMMMMMMMIMM I II 

orf 23 - 1 AAVYRARKNNLATAAGRDPSGNTYYRAANQAKTHGWE I EVGGR I TPEWQ IQAGYSQS KTR 

550 560 570 580 590 600 






610 620 630 640 650 660 

orf 23a . pep DQDGSRLNPDSVPERSFKLFTAYHFAPEAPSGWTIGAGVRWQSETHTDPATLRIPNPAAK 

MIIIIMMIIIIMIMM MIIIIIIMIIIMIIIIIMIIIIIIMIIM I 

orf 23 - 1 DQDGSRLNPDSVPERSFKLFTAYHFAPEAPSGWTIGAGVRWQSETHTDPATLRIPNPAAK 

610 620 630 640 650 660 






670 680 690 700 710 720 

orf 23a . pep ARAADNSRQKAYAVADIMARYRFNPRAELSLNVDNLFNKHYRTQPDRHSYGALRTVNAAF 

1 1 1 11 ! 1 1 1 M 1 1 i II 1 1 1 M N Mill 1 1 1 1 M I MINI I : I 

orf 23 - 1 ARAADNSRQKAYAVADIMARYRFNPRAELSLNVDNLFNKHYRTQPDRHSYGALRTVNAAF 

670 680 690 700 '710 720 



orf23a.pep  TYRFKX
            ||||||
orf23-1     TYRFKX

Homology with a predicted ORF from N. gonorrhoeae 



ORF23 (SEQ ID NO: 664) shows 93.4% identity over a 211 aa overlap with a predicted ORF
(ORF23ng) (SEQ ID NO: 670) from N. gonorrhoeae:

orf23.pep GYNYLFARGSRIANYQINGIPVADALADTGNANTAAYERVEVVRGVAGLLD 51

I I I I I I I I I I I I I I I I I I I I I I I I II I I I I I I I I I I I I I I I I I I I I I I I I
orf23ng SAVDACRIPGYNYLFARGSRIANYQINGIPVADALADTGNANTAAYERVEVVRGVAGLPD 60

orf23.pep GTGEPSATVNLVRKRLTRKPLFEVRAEAGNRKHFGLDADVSGSLNTEXXLRGRLVSTFGR 111

IIIIIIIMIIIIh IIIIIIMIIIMIMIIII lllllllhl :|||IMIIIII
orf23ng GTGEPSATVNLVRKHPTRKPLFEVRAEAGNRKHFGLGADVSGSLNAEGTLRGRLVSTFGR 120

orf23.pep GDSWRRRERSRXAELYGILEYDIAPQTRVHAXMDYQQAKETADAPLSYAVYDSQGYATAF 171

llllh MM II 1 1 1 1 1 1 II 1 1 1 1 1 1 1 1 1 1 M i I M 1 1 1 1 1 1 1 1 1 M 111 I i 1 1 1 1

orf23ng GDSWRQLERSRDAELYGILEYDIAPQTRVHAGMDYQQAKETADAPLSYAVYDSQGYATAF 180

orf23.pep GPKDNPATNWANSHHRALNLFAGIEHRFNQDWKLKAEYDY 211

II I II II II h I h M II 1 1 1 1 II 1 1 1 M I M 1 1 II 1 1 1 1

orf23ng GPKDNPATNWSNSRNRALNLFAGIEHRFNQDWKLKAEYDYTRSRFRQPYGVAGVLSIDHS 240

The ORF23ng nucleotide sequence [<SEQ ID 669>] (SEQ ID NO: 669) is predicted to encode a
protein comprising amino acid sequence [<SEQ ID 670>] (SEQ ID NO: 670):

1 SAVDACRIPG YNYLFARGSR IANYQINGIP VADALADTGN ANTAAYERVE
51 VVRGVAGLPD GTGEPSATVN LVRKHPTRKP LFEVRAEAGN RKHFGLGADV
101 SGSLNAEGTL RGRLVSTFGR GDSWRQLERS RDAELYGILE YDIAPQTRVH
151 AGMDYQQAKE TADAPLSYAV YDSQGYATAF GPKDNPATNW SNSRNRALNL
201 FAGIEHRFNQ DWKLKAEYDY TRSRFRQPYG VAGVLSIDHS TAATDLIPGY
251 WHADPRTHSA SMSLTGKYRL FGREHDLIAG INGYKYASNK YGERSIIPNA
301 IPNAYEFSRT GAYPQPSSFA QTIPQYDTRR QIGGYLATRF RAADNLSLIL
351 GGRYSRYRAG SYNSRTQGMT YVSANRFTPY TGIVFDLTGN LSLYGSYSSL
401 FVPQLQKDEH GSYLKPVTGN NLEADIKGEW LEGRLNASAA VYRARKNNLA
451 TAAGRDQSGN TYYRAANQAK THGWEIEVGG RITPEWQIQA GYSQSKPRDQ
501 DGSRLNPDSV PERSFKLFTA YHLAPEAPSG RTIGAGVRRQ GETHTDPAAL
551 RIPNPAAKAR AVANSRQKAY AVADIMARYR FNPRTELSLN VDNLFNKHYR
601 TQPDRHSYGA LRTVNAAFTY RFK*

Further work revealed the complete nucleotide sequence [<SEQ ID 671>] (SEQ ID NO: 671): 

1 ATGACACGCT TCAAATACTC CCTGCTTTTT GCCGCCCTGC TACCCGTGTA 

51 CGCGCAGGCC GATGTTTCTG TTTCAGACGA CCCCAAACCG CAGGAAAGCA 

101 CCGAATTGCC GACCATCACC GTTACCGCCG ACCGCACCGC GAGTTCCAAC 

151 GACGGCTACA CCGTTTCCGG CACGCACACC CCGTTGGGGC TGCCCATGAC 

201 CCTGCGCGAA ATCCCGCAGA GCGTCAGCGT CATCACATCG CAACAAATGC 

251 GCGACCAAAA CATCAAAACG CTCGACCGCG CCCTGTTGCA GGCGACCGGC 

301 ACCAGCCGCC AGATTTACGG CTCCGACCGC GCGGGCTACA ACTACCTGTT 

351 CGCGCGCGGC AGCCGCATCG CCAACTACCA AATCAACGGC ATCCCCGTTG 

401 CCGACGCGCT GGCCGATACG GGCAATGCCA ACACCGCCGC CTATGAGCGC 

451 GTAGAAGTCG TGCGCGGCGT GGCGGGGCTG CCGGACGGCA CGGGCGAGCC 

501 TTCTGCCACC GTCAATCTGG TACGCAAACA CCCGACCCGC AAGCCATTGT 

551 TTGAAGTCCG CGCCGAAGCC GGCAACCGCA AACATTTCGG GCTGGGCGCG 

601 GACGTATCGG GCAGCCTGAA CGCCGAAGGC ACGCTGCGCG GCCGCCTGGT 

651 TTCCACCTTC GGACGCGGCG ACTCGTGGCG GCAGCTCGAA CGCAGCCGCG 

701 ATGCCGAACT CTACGGCATT TTGGAATACG ACATCGCACC GCAAACCCGC 

751 GTCCACGCAG GCATGGACTA CCAGCAGGCG AAAGAAACCG CAGACGCGCC 

801 GCTCAGCTAC GCCGTGTACG ACAGCCAAGG TTATGCCACC GCCTTCGGCC 

851 CAAAAGACAA CCCCGCCACA AATTGGTCGA ACAGCCGCAA CCGTGCGCTC 

901 AACCTGTTCG CCGGCATAGA ACACCGCTTC AACCAAGACT GGAAACTCAA 

951 AGCCGAATAC GACTACACCC GTAGCCGCTT CCGCCAGCCC TACGGTGTGG 

1001 CAGGCGTACT TTCCATCGAC CACAGCACTG CCGCCACCGA CCTGATTCCC 

1051 GGTTATTGGC ACGCcgatcc GCGCACCCAC AGCGCCAGCA TGTCATTGAC 
1101 CGGCAAATAC CgcctGTTCG GCCGCGAGCA CGATTTAATC GCGGGTATCA 

1151 ACGGCTACAA ATACGCCAGC AACAAATACG GCGAACGCAG CATCATTCCC 

1201 AACGCCATTC CCAACGCCTA CGAATTTTCC CGCACGGGCG CCTATCCGCA 

1251 GCCATCATCG TTTGCCCAAA CCATCCCGCA ATACGACACC AGGCGGCAAA 

1301 TCGGCGGCTA TCTCGCCACC CGTTTCCGCG CCGCCGACAA CCTTTCGCTG 

1351 ATACTCGGCG GCAGATACAG CCGCTACCGC GCAGGCAGCT ACAACAGCCG 

1401 CACACAAGGC ATGACCTATG TGTCCGCCAA CCGTTTCACC CCCTACACAG 

1451 GCATCGTGTT CGATCTGACC GGCAACCTGT CGCTTTACGG CTCGTACAGC 

1501 AGCCTGTTCG TCCCGCAATT GCAAAAAGAC GAACACGGCA GCTACCTGAA 

1551 ACCCGTAACC GGCAACAATC TGGAAGCCGA CATCAAAGGC GAATGGCTTG 

1601 AAGGGCGTCT GAACGCATCC GCCGCCGTGT ACCGCGCCCG TAAAAACAAC 

1651 CTCGCCACCG CAGCAGGACG CGACCAGAGC GGCAACACCT ACTATCGCGC 

1701 CGCCAACCAA GCCAAAACCC ACGGCTGGGA AATCGAAGTC GGCGGCCGCA 

1751 TCACGCCCGA ATGGCAGATA CAGGCAGGCT ACAGCCAAAG CAAACCCCGC 

1801 GACCAAGACG GCAGCCGCCT GAACCCCGAC AGCGTAcCCG AACGCAGCTT 

1851 CAAACTCTTC ACCGCCTACC ACTTAGCCCC CGAAGCCCCC AGCGGCCGGA 

1901 CCATcggTGC GGGTGTGCGC CGGCAGGGCG AAACCCACAC CGACCCAGCC 

1951 GCGCTCCGCA TCCCCAACCC CGCCGCCAAA GCCCGCGCCG TCGCCAACAG 

2001 CCGCCAGAAA GCCTACGCCG TCGCCGACAT CATGGCGCGT TACCGCTTCA 

2051 ATCCGCGCAC CGAACTGTCG CTGAACGTGG ACAACCTGTT CAACAAACAC 

2101 TACCGCACCC AGCCCGACCG CCACAGCTAC GGCGCACTGC GGACAGTGAA 

2151 CGCGGCGTTT ACCTATCGGT TTAAATAA 



This corresponds to the amino acid sequence [<SEQ ID 672; ORF23ng-1>] (SEQ ID NO: 672; 
ORF23ng-1): 



1 MTRFKYSLLF AALLPVYAQA DVSVSDDPKP QESTELPTIT VTADRTASSN 

51 DGYTVSGTHT PFGLPMTLRE IPQSVSVITS QQMRDQNIKT LDRALLQATG 

101 TSRQIYGSDR AGYNYLFARG SRIANYQING I PVADALADT GNANTAAYER 

151 VEWRGVAGL PDGTGEPSAT VNLVRKHPTR KPLFEVRAEA GNRKHFGLGA 

201 DVSGSLNAEG TLRGRLVSTF GRGDSWRQLE RSRDAELYGI LEYDIAPQTR 

251 VHAGMDYQQA KETADAPLSY AVYDSQGYAT AFGPKDNPAT NWSNSRNRAL 

301 NLFAGIEHRF NQDWKLKAEY DYTRSRFRQP YGVAGVLSID HSTAATDLIP 

351 GYWHADPRTH SASMSLTGKY RLFGREHDLI AGINGYKYAS NKYGERS IIP 

401 NAIPNAYEFS RTGAYPQPSS FAQTIPQYDT RRQIGGYLAT RFRAADNLSL 

451 ILGGRYSRYR AGSYNSRTQG MTYVSANRFT PYTGIVFDLT GNLSLYGSYS 

501 SLFVPQLQKD EHGSYLKPVT GNNLEAD I KG EWLEGRLNAS AAVYRARKNN 

551 LATAAGRDQS GNTYYRAANQ AKTHGWEIEV GGRITPEWQI QAGYSQSKPR 

601 DQDGSRLNPD SVPERSFKLF TAYHLAPEAP SGRTIGAGVR RQGETHTDPA 

651 ALRIPNPAAK ARAVANSRQK AYAVADIMAR YRFNPRTELS LNYDNLFNKH 

701 YRTQPDRHSY GALRTVNAAF TYRFK* 



ORF23ng-1 (SEQ ID NO: 672) and ORF23-1 (SEQ ID NO: 666) show 95.9% identity in 725 aa 
overlap: 



10 20 30 40 50 60 

orf 23 - 1 . pep MTRFKYSLLFAALLPVYAQADVSVSDDPKPQESTELPTITVTADRTASSNDGYTVSGTHT 

i M 1 1 II 1 1 1 1 ■ I M M 1 1 1 1 1 1 1 1 1 M 1 1 1 1 1 1 M II 1 1 1 1 1 II 1 1 1 1 ! M 1 1 1 1 1 

orf 23ng-l MTRFKYSLLFAALLPVYAQADVSVSDDPKPQESTELPTITVTADRTASSNDGYTVSGTHT 

10 20 30 40 50 60 

70 80 90 100 110 120 

orf 23 -1 .pep PLGLPMTLREIPQSVSVITSQQMRDQNIKTLDRALLQATGTSRQIYGSDRAGYNYLFARG 

MINIMUM Ml IIIIIIIIIMIIIIIIIIIIIIIMIIIMI I Mill 

orf23ng-l PFGLPMTLREIPQSVSVITSQQMRDQNIKTLDRALLQATGTSRQIYGSDRAGYNYLFARG 

70 80 90 100 110 120 
130 140 150 160 170 180 

orf 23 - 1 . pep SRIANYQINGIPVADALADTGNANTAAYERVEWRGVAGLLDGTGEPSATVNLVRKRLTR 

IMIII lllllll IIMIMIIIIMIIIIIIIMMII II IIIIMI MIMh II 

or f 2 3ng- 1 SRIANYQINGI PVADALADTGNANTAAYERVEWRGVAGLPDGTGEPSATVNLVRKHPTR 

130 140 150 160 170 180 

190 200 210 220 230 240 

orf 2 3 - 1 . pep KPLFEVRAEAGNRKHFGLDADVSGSLNTEGTLRGRLVSTFGRGDSWRRRERSRDAELYGI 

II Mill II I! II II II I I II II MIMI II I II MM II II M I II II 1 1 II 1 1 

orf23ng-l KPLFEVRAEAGNRKHFGLGADVSGSLNAEGTLRGRLVSTFGRGDSWRQLERSRDAELYGI 
190 200 210 220 230 240 






250 260 270 280 290 300 

orf 23 - 1 . pep LEYDIAPQTRVHAGMDYQQAKETADAPLSYAVYDSQGYATAFGPKDNPATNWANSRHRAL 

I M I 1 1 1 1 1 1 I M 1 1 1 1 1 : 1 1 I M 1 1 1 1 1 1 1 II I M 1 1 1 1 1 1 1 : 1 1 M 1 1 : 1 1 h I 

or f 2 3ng- 1 LEYDIAPQTRVHAGMDYQQAKETADAPLSYAVYDSQGYATAFGPKDNPATNWSNSRNRAL 

250 260 270 280 290 300 






310 320 330 340 350 360 

orf 23 - 1 . pep NLFAGIEHRFNQDWKLKAEYDYTRSRFRQPYGVAGVLSIDHNTAATDLIPGYWHADPRTH 

1 1 1 II : II I M M 1 1 M I 1 1 1 M II 1 1 M I M 1 1 1 1 1 hll M 1 1 M 1 1 1 1 1 1 1'l 

orf23ng-l NLFAGIEHRFNQDWKLKAEYDYTRSRFRQPYGVAGVLSIDHSTAATDLIPGYWHADPRTH 

310 320 330 340 350 360 
370 380 390 400 410 420 

orf 23-1 .pep SASVSLIGKYRLFGREHDLIAGINGYKYASNKYGERSIIPNAIPNAYEFSRTGAYPQPAS 

MM I II 1 1 1 M II M 1 1 II II 1 1 II 1 1 M 1 1 1 1 1 1 1 1 1 1 1 M II II 1 1 1 1 II IM 

orf23ng-l SASMSLTGKYRLFGREHDLIAGINGYKYASNKYGERSIIPNAIPNAYEFSRTGAYPQPSS 

370 380 390 400 410 420 
430 440 450 460 470 480 

orf 2 3 - 1 . pep FAQTIPQYGTRRQIGGYLATRFRAADNLSLILGGRYTRYRTGSYDSRTQGMTYVSANRFT 

IMIII 1 1 1 1 1 M II M M I II 1 1 1 M 1 1 1 M I M M I M I M I II 1 1 1 M II I II 

or f 2 3 ng - 1 FAQT I PQ YDTRRQ IGG YLATRFRAADNLS L I LGGR YS RYRAGS YNSRTQGMT YVS ANRFT 

430 440 450 460 470 480 
490 500 510 520 530 540 

orf 23 - 1 . pep PYTGIVFDLTGNLSLYGSYSSLFVPQSQKDEHGSYLKPVTGNNLEAGIKGEWLEGRLNAS 

I I II M I II I M I II II I I I I I I I I I I I II M I II I I I I II II M I II Mill I I II I 
orf 23ng- 1 . PYTGIVFDLTGNLSLYGSYSSLFVPQLQKDEHGSYLKPVTGNNLEADIKGEWLEGRLNAS 

490 500 510 520 530 540 
550 560 570 580 590 600 

orf 23 - 1 . pep AAVYRARKNNLATAAGRDPSGNTYYRAANQAKTHGWEIEVGGRITPEWQIQAGYSQSKTR 

1 1 II 1 1 1 1 1 1 M 1 1 1 1 1 1 1 1 II 1 1 1 1 II II II II II I II M I M M I M 1 1 1 1 1 1 1 1 I 

orf23ng-l AAVYRARKNNLATAAGRDQSGNTYYRAANQAKTHGWEIEVGGRITPEWQIQAGYSQSKPR 

550 560 570 580 590 600 
610 620 630 640 650 660 

orf 23 - 1 . pep DQDGSRLNPDSVPERSFiCLFTAYHFAPEAPSGWTIGAGVRWQSETHTDPATLRIPNPAAK 

. Ill MIMMMIM IMMMIMM lllllll I : I I I I I I I : I I I I I I I I I 
orf23ng-l DQDGSRLNPDSVPERSFKLFTAYHLAPEAPSGRTIGAGVRRQGETHTDPAALRIPNPAAK 

610 620 630 640 650 660 
670 680 690 700 710 720 

orf 23 - 1 . pep ARAADNSRQKAYAVADIMARYRFNPRAELSLNVDNLFNKHYRTQPDRHSYGALRTVNAAF 

MM 1 1 1 1 1 1 II 1 1 1 1 1 1 1 1 1 II Ml I M 1 1 1 II II 1 1 II I 1 1 1 1 1 1 1 1 1 1 M 1 1 

orf23ng-l ARAVANSRQKAYAVADIMARYRFNPRTELSLNVDNLFNKHYRTQPDRHSYGALRTVNAAF 

670 680 690 700 710 720 
orf 23-1. pep TYRFKX 
||||||

orf23ng-1 TYRFKX 

In addition, ORF23ng-1 (SEQ ID NO: 672) shows significant homology with an OMP (SEQ ID 
NO: 1155) from E. coli: 






sp|P16869|FHUE_ECOLI OUTER-MEMBRANE RECEPTOR FOR FE(III)-COPROGEN, FE(III)-FERRIOXAMINE B 
AND FE(III)-RHODOTRULIC ACID PRECURSOR 
>gi|1651542|gnl|PID|d1015403 (D90745) Outer membrane protein FhuE precursor [Escherichia coli] 
>gi|1651545|gnl|PID|d1015405 (D90746) Outer membrane protein FhuE precursor [Escherichia coli] 
>gi|1787344 (AE000210) outer-membrane receptor for Fe(III)-coprogen, 
Fe(III)-ferrioxamine B and Fe(III)-rhodotrulic acid precursor [Escherichia coli] 
Length = 729 
Score = 332 bits (843), Expect = 3e-90 

Identities = 228/717 (31%), Positives = 350/717 (48%), Gaps = 60/717 (8%) 

Query : 38 T I TVTADRTAS SN - -DGYTVSGTHTPFGLPMTLREIPQSVSVITSQQMRDQNIKTLDRAL 95 

T+ V TA + + Y+V+ T + MT R+IPQSV++++ Q+M DQ ++TL + 

Sbjct : 43 TVIVEGSATAPDDGENDYSVTSTSAGTKMQMTQRDIPQSVTIVSQQRMEDQQLQTLGEVM 102 



Query: 96 LQ ATGTS RQ I YGS DRAG YNYL F ARGS R I AN YQ I NG I P VADALADTGNANTAA 147 

G S+ SDRA Y ++RG +1 NY ++GIP + DAL+D A 

Sbjct: 103 ENTLG I S KSQADSDRALY YSRGFQIDNYMVDGIPTYFESRWNLGDALSDM AL 154 

Query: 148 YERVEWRGVAGLPDGTGEPSATVNLVRKHPTRKPLF-EVRAEAGNRKHFGLGADVSGSL 206 

+ERVEWRG GL GTG PSA +N+VRKH T + +V AE G+ AD+ L 

Sbjct: 155 FERVEVVRGATGLMTGTGNPSAAINMVRKHATSREFKGDVSAEYGSWNKERYVADLQSPL 214 

Query: 207 NAEGTLRGRLVSTFGRGDSWRQLERSRDAELYGILEYDIAPQTRVHAGMDYQQAKETADA 266 

+G +R R+V + DSW S GI++ D+ T + AG +YQ+ + 

Sbjct: 215 TEDGKIRARIVGGYQNNDSWLDRYNSEKTFFSGIVDADLGDLTTLSAGYEYQRIDVNSPT 274 

Query: 267 PLSYAVYDSQGYATAFGPKDNPATNWSNSRNRALNLFAGIEHRFNQDWKLKAEYDYTRSR 326 

+++ G + ++ + A +W+ + +F ++ +F W+ ++ 

Sbjct: 275 WGGLPRWNTDGSSNSYDRARSTAPDWAYNDKEINKVFMTLKQQFADTWQATLNATHSEVE 334 

Query: 327 F- -RQPYGVAGVLSIDHSTAA- -TDLIPGY WHADPRTHSA- SMSLTGKYRLFG 374 

F + Y A V . D ++ PG+ W++ R A + G Y LFG 

Sbjct: 335 FDSKMMYVDAYVNKADGMLVGPYSNYGPGFDYVGGTGWNSGKRKVDALDLFADGSYELFG 394 

Query: 375 REHDL I AG I NGYKYASNKYGER - - S 1 1 PNA I PNAYEFSRTGAYPQPS S FAQT I PQYDTRR 432 

R+H+L+ G Y +N+Y +1 P+ I + Y F+ G +PQ Q++ Q DT 

Sbjct: 395 RQHNLMFG-GSYSKQNNRYFSSWANIFPDEIGSFYNFN- -GNFPQTDWSPQSLAQDDTTH 451 

Query: 433 QIGGYLATRFRAADNLSLILGGRYSRYRAGSYNSRTQGMTY-VSANRFTPYTGIVFDXXX 491 

Y ATR AD L LILG RY+ +R + +TY + N TPY G+VFD 

Sbjct: 452 MKS LYAATRVTLADPLHL I LGARYTNWRVDT LTYSMEKNHTTPYAGLVFDIND 504 

Query: 492 XXXXXXXXXXXFVPQLQKDEHGSYLKPVTGNNLEADIKGEWLEGRLNASAAVYRARKNNL 551 

F PQ +D G YL P+TGNN E +K +W+ RL + A++R ++N+ 

Sbjct: 505 NWSTYASYTSIFQPQNDRDSSGKYLAPITGNNYELGLKSDWMNSRLTTTLAIFRIEQDNV 564 

Query: 552 ATAAGR DQSGNTYYRAANQAKTHGWE I EVGGR I TPEWQ I QAGYSQS KPRDQDGSRLN 608 

A + G +G T Y+A + + G E E+ G IT WQ+ G ++ D +G+ +N 

Sbjct: 565 AQSTGTPIPGSNGETAYKAVDGTVSKGVEFELNGAITDNWQLTFGATRYIAEDNEGNAVN 624 




Query: 609 PDSVPERSFKLFTAYHIAPEAPSGRTIGAGVRRQGETHTDPAALRIPNPAAKARAVANSR 668 

p + + p + K+FT+Y LP P T+G GV Q +TD P RA 
Sbjct: 625 P - NLPRTTVKMFTS YRL- PVMPE - LTVGGGVNWQNRVYTDTV TPYGTFRA E 672 

Query: 669 QKAYAVADIMARYRFNPRTELSLNVDNLFNKHYRTQPDRH-SYGALRTVNAAFTYRF 724 
Q +YA+ D+ RY+ L NV+NLF+K Y T + YG R. + TY+F 

Sbjct: 673 QGS YALVDLFTRYQVTKNFSLQGNVNNLFDKTYDTNVEGS I VYGTPRNFS I TGTYQF 729 
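The percentages in the statistics line of the comparison above can be checked from the counts it reports. The following is a minimal sketch, not part of the original analysis: the function names are hypothetical, and the use of truncation rather than rounding is an assumption that happens to reproduce the three printed figures (228/717 is about 31.8%, reported as 31%).

```python
import re

def blast_percent(numerator: int, denominator: int) -> int:
    """Whole-number percentage, truncated rather than rounded (assumption)."""
    return (100 * numerator) // denominator

def parse_blast_stats(line: str) -> dict:
    """Extract 'Name = n/d' pairs from a statistics line and recompute percentages."""
    stats = {}
    for name, num, den in re.findall(r"(\w+)\s*=\s*(\d+)/(\d+)", line):
        stats[name] = (int(num), int(den), blast_percent(int(num), int(den)))
    return stats

# Statistics line copied from the comparison with the E. coli FhuE precursor.
line = "Identities = 228/717 (31%), Positives = 350/717 (48%), Gaps = 60/717 (8%)"
stats = parse_blast_stats(line)
for name, (num, den, pct) in stats.items():
    print(f"{name}: {num}/{den} = {pct}%")
```

Recomputing the figures this way is a quick consistency check on a transcribed statistics line, since a misread digit in either count would usually change the percentage.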

Based on this analysis, it was predicted that these proteins from N. meningitidis and N. gonorrhoeae, 
and their epitopes, could be useful antigens for vaccines or diagnostics, or for raising antibodies. 

ORF23-1 (SEQ ID NO: 666) (77.5 kDa) was cloned in pET and pGex vectors and expressed in 
E. coli, as described above. The products of protein expression and purification were analyzed by 
SDS-PAGE. Figure 15A shows the results of affinity purification of the His-fusion protein, and 
Figure 15B shows the results of expression of the GST-fusion in E. coli. Purified His-fusion protein 
was used to immunise mice, whose sera were used for Western blot (Figure 15C) and for ELISA 
(positive result). These experiments confirm that ORF23-1 (SEQ ID NO: 666) is a surface-exposed 
protein, and that it is a useful immunogen. 
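A molecular mass such as the 77.5 kDa quoted above can be estimated from an amino acid sequence by summing average residue masses and adding one water molecule. The sketch below is illustrative only: the text does not say how its figure was obtained, the function name is hypothetical, and the residue masses are standard average values (rounded to two decimals) supplied here as an assumption.

```python
# Average residue masses in daltons (standard values, rounded; an assumption,
# not taken from the specification).
RESIDUE_MASS = {
    "G": 57.05, "A": 71.08, "S": 87.08, "P": 97.12, "V": 99.13,
    "T": 101.11, "C": 103.14, "L": 113.16, "I": 113.16, "N": 114.10,
    "D": 115.09, "Q": 128.13, "K": 128.17, "E": 129.12, "M": 131.19,
    "H": 137.14, "F": 147.18, "R": 156.19, "Y": 163.18, "W": 186.21,
}
WATER_MASS = 18.02  # one water per chain (free N- and C-termini)

def estimate_mass_da(protein: str) -> float:
    """Estimate the average mass of a polypeptide in daltons.

    Whitespace and a trailing '*' (stop) are ignored; any other unknown
    symbol, such as 'X', raises ValueError rather than being guessed.
    """
    seq = "".join(protein.split()).rstrip("*").upper()
    unknown = sorted(set(seq) - set(RESIDUE_MASS))
    if unknown:
        raise ValueError(f"cannot weigh residues: {unknown}")
    return sum(RESIDUE_MASS[aa] for aa in seq) + WATER_MASS

# Small check on a dipeptide (glycylglycine, about 132.12 Da).
print(round(estimate_mass_da("GG"), 2))
```

Applied to a full-length listing, the result divided by 1000 gives a value in kDa that can be compared with the band position on SDS-PAGE; fusion tags and any leader peptide would need to be accounted for separately.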

Example 80 

The following partial DNA sequence was identified in N. meningitidis [<SEQ ID 673>] (SEQ ID 
NO: 673): 

1 ATGCGCACGG CAGTGGTTTT GCTGTTGATC ATGCCGATGG CGGCTTCGTC 

51 GGCAATGATG CCGGAAATGG TGTGCGCGGG CGTGTCGCCG GGAACGGCAA 

101 TCATATCCAA GCCGACCGAA CAAACGGCGG TCATGGCTTC GAGTTTGTCC 

151 AGCGTCAgcA CGCCTGCTTC GGCGgcGgCa ATCATACCTT CGTCTTCGGA 

201 AACGGGGATA AACGcGCCAC TCAAACCCCC GACCGCGCTG GAAGCCATCA 

251 TGCCGCCTTT TTTCAGGGCA TCGTTCAGCA ATGCCAAAGC TGCTGTTGTG 

301 CCGTGCGTAC CGCAGACGCT CAAGCCCATT TnTTCAAGAA TGCGTGCCAC 

351 TnAGTCGCCG ACGGGG.. 

This corresponds to the amino acid sequence [<SEQ ID 674; ORF24>] (SEQ ID NO: 674; 
ORF24): 

1 MRTAWLLLI MPMAASSAMM PEMVCAGVSP GTAIISKPTE QTAVMASSLS 

51 SVSTPASAAA IIPSSSETGI NAPLKPPTAL EAIMPPFFTA SFSNAKAAW 
101 PCVPQTLKPI XSRMRATXSP TG. . 

Further work revealed the complete nucleotide sequence [<SEQ ID 675>] (SEQ ID NO: 675): 






1 ATGCGCACGG CAGTGGTTTT GCTGTTGATC ATGCCGATGG CGGCTTCGTC 
51 GGCAATGATG CCGGAAATGG TGTGCGCGGG CGTGTCGCCG GGAACGGCAA 
101 TCATATCCAA GCCGACCGAA CAAACGGCGG TCATGGCTTC GAGTTTGTCC 

151 AGCGTCAGCA CGCCTGCTTC GGCGGCGGCA ATCATACCTT CGTCTTCGGA 

201 AACGGGGATA AACGCGCCAC TCAAACCCCC GACCGCGCTG GAAGCCATCA 

251 TGCCGCCTTT TTTCACGGCA TCGTTCAGCA ATGCCAAAGC TGCTGTTGTG 

301 CCGTGCGTAC CGCAGACGCT CAAGCCCATT TCTTCAAGAA TGCGTGCCAC 

351 TGAGTCGCCG ACGGCGGGGG TGGGCGCCAG CGACAAGTCG AGAATACCAA 

401 ACGGGATATT CAGCATTTTT GAGGCTTCGC GGCCGATGAG TTCGCCCACG 

451 CGGGTAATTT TGAAAGCAGT TTTCTTCACT ACTTCCGCAA CTTCGGTCAA 

501 TGTCGTTGCA TCTGAATTTT CCAACGCGGC TTTTACGACA CCTGGGCCGG 

551 ATACGCCGAC ATTGATAACG GCATCCGCTT CGCCCGAACC ATGAAACGCG 

601 CCCGCCATAA ACGGGTTGTC TTCCACCGCG TTGCAGAACA CGACAATTTT 

651 AGCGCAGCCG AAACCTTCGG GCGTGATTTC CGCCGTGCGT TTGACGGTTT 

701 CGCCCGCCAG CTTGACCGCA TCCATATTGA TACCGGCACG CGTACTGCCG 

751 ATATTGATGG AGCTGCACAC AATATCGGTA GTCTTCATCG CTTCGGGAAT 

801 GGAGCGGATT AACACCTCAT CCGAAGGCGA CATCCCTTTT TGCACCAACG 

851 CGGAAAAACC GCCGATAAAA GACACACCGA TGGCTTTGGC AGCTTTATCC 

901 AAAGTTTGCG CCACGCTGAC GTAA 

This corresponds to the amino acid sequence [<SEQ ID 676; ORF24-1>] (SEQ ID NO: 676; 
ORF24-1): 

1 MRTAWLLLI MPMAASSAM M PEMVCAGVSP GTAIISKPTE QTAVMASSLS 
51 SVSTPASAAA IIPSSSETGI NAPLKPPTAL EAIMPPFFTA SFSNAKAAW 
101 PCVPQTLKPI SSRMRATESP TAGVGASDKS RIPNGIFSIF EASRPMSSPT 
151 RVILKAVFFT TSATSVNWA SEFSNAAFTT PGPDTPTLIT ASASPEP*NA 
201 PAINGLSSTA LQNTTILAQP KPSGVIS AVR LTVS PASLTA SILI PAR VLP 
251 ILMELHTISV VFIA SGMERI NTSSEGDIPF CTNAEKPPIK DTPMALAALS 
301 KVCATLT* 

Computer analysis of this amino acid sequence gave the following results: 
Homology with a predicted ORF from N. meningitidis (strain A) 

ORF24 (SEQ ID NO: 674) shows 96.4% identity over a 307 aa overlap with an ORF (ORF24a) 
(SEQ ID NO: 678) from strain A of N. meningitidis: 

10 20 30 40 50 60 

orf 24a .pep MRTAWLLLIMPMAASSAMMPEMVCAGVSPGTAIISXPTEQTAVIASSLSNVSTPASAAA 

M I M 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 M 1 1 1 II II MMIIH IIMIMMIII 

orf 24 MRTAWLLL I MPMAAS S AMMPEMVCAGVS PGTAI I S KPTEQTAVMAS SLS SVSTPASAAA 

10 20 30 40 50 60 

70 80 90 100 110 120 

orf 24a. pep IIPSSSXTGINAPLKPPTALEAIMPPFFTASFSNAKAAWPCVPQTLKP I SSRMRATESP 

IIIIM 1 1 1 1 1 1 M 1 1 1 1 1 , 1 1 1 II 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 M M I M 

orf 24 1 1 PSSSETGINAPLKPPTALEAIMPPFFTASFSNAKAAWPCVPQTLKP I SSRMRATESP 

70 80 90 100 110 120 

130 140 150 160 170 180 

orf 24a. pep TAGVGASDKS R I PNG I FS I FEASRPMSSPTRVILKAVFFTTSATSVNWASEFSNAAFTT 

IIIMIIIIIIIIIIIII lllllllllllllMIIIIIII I IIMIMMIllll 

orf 24 TAGVGASDKSRI PNGI FS I FEASRPMSSPTRVILKAVFFTTSATSVNVVASEFSNAAFTT 

130 140 150 160 170 180 
190 200 210 220 230 240 

orf 24a . pep PGPDTPTL ITASAS PEPXNAPAIXGLSSXALQNTT I LAQPKPSSVI SXVRLMVS PASLTA 

1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 M I lllhllllMIIIMIIhlll Ml llllllll 

orf 24 PGPDTPTL I TASASPEPXNAPAINGLSSTALQNTT I LAQPKPSGV I SAVRLTVS PASLTA 

190 200 210 220 230 240 
250 260 270 280 290 300 

orf 24a. pep SILIPARVLPILMELHTISWFIASGMERXNTSSEGDIPFCTSAEKPPIKDTPMALAALS 

Illllllllllllllllllllllllllll 1 1 1 1 1 1 1 1 1 1 M 1 1 1 1 1 1 1 1 li 1 1 1 1 

orf 24 SILIPARVLPILMELHTISWFIASGMERINTSSEGDIPFCTNAEKPPIKDTPMALAALS 

250 260 270 280 290 300 



or f 2 4 a . pep KVCATLTX 

llllllll 
orf 24 KVCATLTX 


The complete length ORF24a nucleotide sequence [<SEQ ID 677>] (SEQ ID NO: 677) is: 

1 ATGCGCACGG CAGTGGTTTT GCTGTTGATC ATGCCGATGG CGGCTTCGTC 

51 GGCAATGATG CCGGAAATGG TGTGCGCGGG TGTGTCGCCG GGAACGGCAA 

101 TCATATCCAA NCCGACCGAA CAAACGGCGG TCATCGCTTC GAGTTTATCC 

151 AACGTCAGCA CGCCTGCTTC GGCGGCGGCA ATCATACCTT CGTCTTCGGA 

201 NACGGGGATA AACGCGCCAC TCAAACCGCC AACCGCGCTC GAAGCCATCA 

251 TGCCGCCCTT TTTCACGGCA TCGTTCAGCA ATGCCAAAGC TGCTGTTGTG 
301 CCGTGCGTAC CGCAGACGCT CAAACCCATT TCTTCAAGAA TGCGCGCCAC 

351 CGAGTCGCCG ACGGCAGGGG TCGGTGCCAG CGACAAGTCG AGAATACCAA 
401 ACGGGATATT CAGCATTTTT GAGGCTTCGC GGCCGATGAG TTCGCCCACG 

451 CGGGTAATTT TGAAGGCGGT TTTCTTCACA ACTTCGGCAA CTTCGGTCAA 
501 TGTCGTTGCA TCCGAATTTT CCAACGCGGC TTTTACGACA CCCGGGCCGG 
551 ATACGCCGAC ATTAATCACA GCATCCGCTT CGCCTGAGCC GTGAAACGCG 
601 CCCGCCATAN ACGGGTTGTC TTCCNCCGCG TTGCAGAACA CGACGATTTT 

651 GGCGCAGCCG AAACCTTCTA GTGTGATTTC ANCCGTGCGT TTGATGGTTT 

701 CGCCCGCCAG TCTGACCGCG TCCATATTGA TACCGGCGCG CGTACTGCCG 

751 ATATTGATGG AGCTGCACAC GATATCAGTA GTCTTCATCG CTTCGGGAAT 

801 GGAACGGATN AACACCTCGT CAGAAGGCGA CATACCTTTT TGCACCAGCG 

851 CGGAAAAGCC GCCAATAAAA GACACGCCGA TGGCTTTGGC AGCCTTATCC 

901 AAAGTTTGCG CCACGCTGAC GTAA 

This encodes a protein having amino acid sequence [<SEQ ID 678>] (SEQ ID NO: 678): 

1 MRTAWLLLI MPMAASSAMM PEMVCAGVSP GTAIISXPTE QTAVIASSLS 

51 NVSTPASAAA IIPSSSXTGI NAPLKPPTAL EAIMPPFFTA SFSNAKAAW 

101 PCVPQTLKPI SSRMRATESP TAGVGASDKS RIPNGIFSIF EASRPMSSPT 

151 RVILKAVFFT TSATSVNWA SEFSNAAFTT PGPDTPTLIT ASASPEP*NA 

201 PAIXGLSSXA LQNTTILAQP KPSSVISXVR LMVS PASLTA SILIPARVLP 

251 ILMELHTISV VFIASGMERX NTSSEGDIPF CTSAEKPPIK DTPMALAALS 

301 KVCATLT* 


It should be noted that this protein includes a stop codon at position 198. 
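The position of an internal stop can be located directly in a translated listing, where a stop is written as `*`. The sketch below is a hypothetical helper, not part of the original analysis; it is applied to the last ten-residue group of the row numbered 151 (residues 191 to 200), with the stop written as `*` as in the ORF24-1 listing.

```python
from typing import List

def internal_stop_positions(protein: str, first_position: int = 1) -> List[int]:
    """Return 1-based positions of '*' that are not the final, terminating stop.

    `first_position` is the residue number of the first character supplied,
    so that a fragment of a numbered listing can be checked on its own.
    """
    seq = "".join(protein.split())
    body = seq[:-1] if seq.endswith("*") else seq
    return [first_position + i for i, aa in enumerate(body) if aa == "*"]

# Residues 191-200 of the listing: the '*' falls at position 198.
print(internal_stop_positions("ASASPEP*NA", first_position=191))

# The terminal stop of the final row is not reported as internal.
print(internal_stop_positions("KVCATLT*", first_position=301))
```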



ORF24a (SEQ ID NO: 678) and ORF24-1 (SEQ ID NO: 676) show 96.4% identity in 307 aa 
overlap: 
10 20 30 40 50 60 

orf 24a. pep MRTAWLLLIMPMAASSAMMPEMVCAGVSPGTAIISXPTEQTAVIASSLSNVSTPASAAA 

1 1 1 1 M I II 1 1 II 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 ! 1 1 IMIIMIMMIIIII I 

orf 24-1 MRTAWLLLIMPMAASSAMMPEMVCAGVS PGTAIISKPTEQTAVMASSLSSVSTPASAAA 

10 20 30 40 50 60 
70 80 90 100 110 120 

orf 24a. pep IIPSSSXTGINAPLKPPTALEAIMPPFFTASFSNAKAAVVPCVPQTLKPISSRMRATESP 

Mil 1 1 1 II 1 1 1 1 1 1 1 Ih 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 M 1 1 1 1 1 1 1 1 1 1 II 1 1 1 1 1 1 

orf 24-1 IIPSSSETGINAPLKPPTALEAIMPPFFTASFSNAKAAWPCVPQTLKPISSRMRATESP 

70 80 90 100 110 120 
130 140 150 160 170 180 

orf 24a. pep TAGVGASDKSRIPNGIFSIFEASRPMSSPTRVILKAVFFTTSATSVNVVASEFSNAAFTT 
I M I I I I I II I I I I I I I I I I I I I I I II I I I I I I I I M I I I I I I M I I II I I I : I I 

or f 2 4 - 1 TAGVGASDKSRI PNG I FS I FEASRPMSS PTRVI LKAVFFTTSATSVNWASEFSNAAFTT 

130 140 150 160 170 180 
190 200 210 220 230 240 

orf 24a . pep PGPDTPTLITASASPEPXNAPAIXGLSSXALQNTTILAQPKPSSVISXVRLMVSPASLTA 

lllllllllllllllllllllll MIM-MIIIIII IIMM ,1 llllllll 

orf 24 - 1 PGPDTPTLITASASPEPXNAPAINGLSSTALQNTTILAQPKPSGVISAVRLTVSPASLTA 

190 200 210 220 230 240 
250 260 270 280 290 300 

orf 24a. pep SILIPARVLPILMELHTISVVFIASGMERXNTSSEGDIPFCTSAEKPPIKDTPMALAALS 

IIIIIIIIIIIIIIIMIIIIIIIIIIII lllllllll IhllllMIIIIIIII I 

orf 24-1 SILIPARVLPILMELHTISWFIASGMERINTSSEGDIPFCTNAEKPPIKDTPMALAALS 

250 260 270 280 290 300 



orf 24a. pep KVCATLTX 

||||||||
orf 24-1 KVCATLTX 



Homology with a predicted ORF from N. gonorrhoeae 

ORF24 (SEQ ID NO: 674) shows 96.7% identity over a 121 aa overlap with a predicted ORF 
(ORF24ng) (SEQ ID NO: 680) from N. gonorrhoeae: 
orf 24 .pep MRTAWLLLIMPMAASSAMMPEMVCAGVS PGTAIISKPTEQTAVMASSLSSVSTPASAAA 60 

I I I I M I I I I I I I I ! I I I I I I I I I I I I I I I I I I I M I I I I I I I I I I I II I I I H I I I I I I 
orf 24ng MRT A WLLL I M PMAAS S AMM P EMVCAGVS PGT A I MS KPTEQTAVMAS S LS S VNT PAS AAA 60 
orf 24 .pep IIPSSSETGINAPLKPPTALEAIMPPFFTASFSNAKAAWPCVPQTLKPIXSRMRATXSP 120 

1 1 1 1 1 1 1 1 1 1 1 1 M 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 M 1 1 1 1 IIMM II 

orf 24ng IIPSSSETGINAPLKPPTALEAIMPPFFTASFSNAKAAWPCVPQTLKPISSRMRATESP 120 

orf 24. pep TG 122 

h 

orf 24ng TAGVGASDKSRMPNGIFSIFEASRPMSSPTRVILKAVFFTTSATSVRLTASEFSSAALTT 180 

The complete length ORF24ng nucleotide sequence [<SEQ ID 679>] (SEQ ID NO: 679) is: 

1 ATGCGCACGG CGGTGGTTTT GCTGTTGATC ATGCCGATGG CGGCTTCGTC 
51 GGCGATGATG CCGGAAATGG TGTGCGCGGG CGTGTCGCCG GGAACGGCAA 
101 TCATGTCCAA ACCAACGGAG CAGACGGCGG TCATGGCTTC GAGTTTGTCC 
151 AGCGTCAACA CGCCTGCCTC GGCGGCGGCA ATCATACCTT CGTCTTCGGA 
201 AACGGGGATA AACGCGCCGC TCAAACCGCC GACCGCGCTG GAAGCCATCA 
251 TGCCGCCCTT TTTCACGGCA TCGTTCAGCA ATGCCAAAGC TGCTGTTGTG 
301 CCGTGCGTAC CGCAGACGCT CAAGCCCATT TCTTCAAGAA TGCGCGCCAC 
351 CGAGTCGCCG ACGGCGGGGG TCGGTGCCAG CGACAAATCG AGAATGCCGA 
401 ACGGGATATT CAGCATTTTT GAGGCTTCGC GACCGATGAG TTCGCCCACG 
451 CGGGTGATTT TGAAAGCGGT TTTCTTCACG ACTTCGGCGA CCTCGGTCAG 
501 GCTGACCGCG TCCGAATTTT CCAGCGCGGC TTTGACCACG CCTGGACCGG 
551 ATACGCCGAC ATTAATCACA GCATCCGCTT CGCCCGAGCC GTGGAACGCA 
601 CCCGCCATAA ACGGATTGTC TTCCACCGCG TTGCAGAACA CGACGATTTT 
651 GGCGCAGCCG AAACCTTCGG GTGTGATTTC AGCCGTGCGT TTGATGGTTT 
701 CGCCTGCCAG CTTGACCGCA TCCATATTGA TACCGGCACG CGTGCTGCCG 
751 ATATTGATGG AGCTGCACAC GATATCGGTA GTTTTCATCG CTTCGGGAAC 
801 GGAACGGATC AACACCTCAT CCGAAGGCGA CATACCTTTT TGCACCAGCG 
851 CGGAAAAGCC GCCGATAAAG GACACGCCGA TGGCTTTGGC TGCCTTGTCC 
901 AAAGTCTGCG CCACGCTGAC ATAA 



This encodes a protein having amino acid sequence [<SEQ ID 680>] (SEQ ID NO: 680): 

1 MRTAVVLLLI MPMAASSAMM PEMVCAGVSP GTAIMSKPTE QTAVMASSLS 
51 SVNTPASAAA IIPSSSETGI NAPLKPPTAL EAIMPPFFTA SFSNAKAAVV 
101 PCVPQTLKPI SSRMRATESP TAGVGASDKS RMPNGIFSIF EASRPMSSPT 
151 RVILKAVFFT TSATSVRLTA SEFSSAALTT PGPDTPTLIT ASASPEPWNA 
201 PAINGLSSTA LQNTTILAQP KPSGVISAVR LMVSPASLTA SILIPARVLP 
251 ILMELHTISV VFIASGTERI NTSSEGDIPF CTSAEKPPIK DTPMALAALS 
301 KVCATLT* 



ORF24ng (SEQ ID NO: 680) and ORF24-1 (SEQ ID NO: 676) show 96.1% identity in 307 aa 
overlap: 






10 20 30 40 50 60 

orf 24-1 .pep MRTAWLLLIMPMAASSAMMPEMVCAGVSPGTAIISKPTEQTAVMASSLSSVSTPASAAA 

MINIMI! IIIIIMIMIIIIIIIIIIIIhlllllMIIIM I IMIIIII 

orf24ng MRTAWLLLIMPMAASSAMMPEMVCAGVSPGTAIMSKPTEQTAVMASSLSSVNTPASAAA 

10 20 30 40 50 60 






70 80 90 100 110 120 

orf 24-1. pep IIPSSSETGINAPLKPPTALEAIMPPFFTASFSNAKAAWPCVPQTLKPISSRMRATESP 

II 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 II M i 1 1 1 1 1 M 1 1 1 II 1 1 M 1 1 1 1 1 U II 1 1 1 1 M I II 

orf24ng 1 1 PSSSETGINAPLKPPTALEAIMPPFFTASFSNAKAAWPCVPQTLKPI SSRMRATESP 

70 80 90 100 110 120 






130 140 150 160 170 180 

orf 24-1 .pep TAGVGASDKSRIPNGIFSIFEASRPMSSPTRVILKAVFFTTSATSVNWASEFSNAAFTT 

IMMM IMMIIIMM MINIMUM IMIMMII ::|||lhl|:|| 
orf24ng TAGVGASDKSRMPNGIFSIFEASRPMSSPTRVILKAVFFTTSATSVRLTASEFSSAALTT 

130 140 150 160 170 180 






190 200 210 220 230 240 

orf 24-1 .pep PGPDTPTLITASASPEPXNAPAINGLSSTALQNTTILAQPKPSGVISAVRLTVSPASLTA 

II 1 1 1 1 1 1 i 1 1 1 1 1 1 1 1 1 1 i I i 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 M 1 1 llllllll 

orf24ng PGPDTPTLITASASPEPWNAPAINGLSSTALQNTTILAQPKPSGVISAVRLMVSPASLTA 

190 200 210 220 230 240 
250 260 270 280 290 300 

orf 24 - 1 . pep SILIPARVLPILMELHTISWFIASGMERINTSSEGDIPFCTNAEKPPIKDTPMALAALS 

Illlllllllllllllllllllllll I I I I I I I I I I I I I I : I I I I I I I I I II I II I I 
orf24ng SILIPARVLPILMELHTISWFIASGTERINTSSEGDIPFCTSAEKPPIKDTPMALAALS 

250 260 270 280 290 300 



orf 24-1. pep KVCATLTX 
III III 

orf24ng KVCATLTX 

Based on this analysis, including the presence of a putative leader sequence (first 18 aa - double- 
underlined) and putative transmembrane domains (single-underlined) in the gonococcal protein, it 
is predicted that the proteins from N. meningitidis and N. gonorrhoeae, and their epitopes, could be 
useful antigens for vaccines or diagnostics, or for raising antibodies. 

Example 81 

The following partial DNA sequence was identified in N. meningitidis [<SEQ ID 681>] (SEQ ID 
NO: 681): 

1 . . ACCGACGTGC AAAAAGAGTT GGTCGGCGAA CAACGCAAGT GGGCGCAGGA 

51 AAAAATCAGC AACTGCCGAC AAGCCGCCGC GCAGGCAGAC CGGCAGGAAT 

101 ACGCCGAATA CCTCAAGCTG CAATGCGACA CGCGGATGAC GCGCGAACGG 

151 ATACAGTATC TTCGCGGCTA TTCCATCGAT TAG 

This corresponds to the amino acid sequence [<SEQ ID 682; ORF25>] (SEQ ID NO: 682; 
ORF25): 

1 . . TDVQKELVGE QRKWAQEKIS NCRQAAAQAD RQEYAEYLKL QCDTRMTRER 
51 IQYLRGYSID * 
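The correspondence between a nucleotide listing and its amino acid sequence can be reproduced by translating the DNA codon by codon. The following is a minimal sketch under stated assumptions: it uses the standard genetic code, reads from the first base supplied (the partial sequence above is taken to start in frame), and renders any codon containing an ambiguous base as `X`. The names `CODON_TABLE` and `translate` are illustrative.

```python
# Standard genetic code, laid out in the conventional T, C, A, G order.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    b1 + b2 + b3: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, b1 in enumerate(BASES)
    for j, b2 in enumerate(BASES)
    for k, b3 in enumerate(BASES)
}

def translate(dna: str) -> str:
    """Translate DNA in frame 1; whitespace is ignored, '*' marks a stop,
    and codons with ambiguous bases (e.g. N) become 'X'."""
    seq = "".join(dna.split()).upper()
    usable = len(seq) - len(seq) % 3  # drop an incomplete trailing codon
    return "".join(
        CODON_TABLE.get(seq[i:i + 3], "X") for i in range(0, usable, 3)
    )

# First thirty bases and final row of the partial sequence above.
print(translate("ACCGACGTGC AAAAAGAGTT GGTCGGCGAA"))
print(translate("ATACAGTATC TTCGCGGCTA TTCCATCGAT TAG"))
```

The two printed fragments can be compared with the start and end of the amino acid listing given above for this partial sequence.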

Further work revealed the complete nucleotide sequence [<SEQ ID 683>] (SEQ ID NO: 683): 

1 ATGTATCGGA AACTCATTGC GCTGCCGTTT GCCCTGCTGC TTGCCGCTTG 

51 CGGCAGGGAA GAACCGCCCA AGGCATTGGA ATGCGCCAAC CCCGCCGTGT 

101 TGCAAGGCAT ACGCGGCAAT ATTCAGGAAA CGCTCACGCA GGAAGCGCGT 

151 TCTTTCGCGC GCGAAGACGG CAGGCAGTTT GTCGATGCCG ACAAAATTAT 

201 CGCCGCCGCC TACGGTTTGG CGTTTTCTTT GGAACACGCT TCGGAAACGC 
251 AGGAAGGCGG GCGCACGTTC TGTATCGCCG ATTTGAACAT TACCGTGCCG 
301 TCTGAAACGC TTGCCGATGC CAAGGCAAAC AGCCCCCTGT TGTACGGGGA 

351 AACTGCTTTG TCGGATATTG TGCGGCAGAA GACGGGCGGC AATGTCGAGT 

401 TTAAAGACGG CGTATTGACG GCAGCCGTCC GCTTCCTGCC CGTCAAAGAC 
451 GGTCAGACGG CATTTGTCGA CAACACGGTC GGTATGGCGG CGCAAACGCT 
501 GTCTGCCGCG CTGCTGCCTT ACGGCGTGAA GAGCATCGTG ATGATAGACG 
551 GCAAGGCGGT GAAAAAAGAA GACGCGGTCA GGATTTTGAG CGGAAAAGCC 
601 CGTGAAGAAG AACCGTCCAA ACCCACGCCC GAAGACATTT TGGAACACAA 
651 TGCCGCCGGC GGCGATGCGG GCGTACCCCA AGCCGCAGAA GGCGCGCCCG 
701 AACCGGAAAT CCTGCATCCT GACGACGGCG AGCGTGCCGA TACCGTTACC 
751 GTATCACGGG GCGAAGTGGA AGAGGCGCGC GTACAAAACC AGCGTGCGGA 

801 ATCCGAAATT ACCAAACTTT GGGGAGGACT CGATACCGAC GTGCAAAAAG 

851 AGTTGGTCGG CGAACAACGC AAGTGGGCGC AGGAAAAAAT CAGCAACTGC 

901 CGACAAGCCG CCGCGCAGGC AGACCGGCAG GAATACGCCG AATACCTCAA 

951 GCTGCAATGC GACACGCGGA TGACGCGCGA ACGGATACAG TATCTTCGCG 

1001 GCTATTCCAT CGATTAG 
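Sequence listings of this kind give a position number followed by blocks of ten residues, so a row can be checked against the running length of the sequence before the blocks are joined. The sketch below is a hypothetical helper (`join_listing` is not from the source) that concatenates rows and raises an error when a row's stated position disagrees with the number of residues already read, which also catches position numbers that were split during transcription.

```python
from typing import Iterable

def join_listing(rows: Iterable[str]) -> str:
    """Join numbered listing rows into one sequence, validating each position.

    Each row is '<position> <block> <block> ...'. The position must equal
    the count of residues read so far plus one.
    """
    seq = ""
    for row in rows:
        tokens = row.split()
        if not tokens:
            continue  # skip blank lines between rows
        position, blocks = int(tokens[0]), tokens[1:]
        if position != len(seq) + 1:
            raise ValueError(
                f"row starts at {position}, expected {len(seq) + 1}"
            )
        seq += "".join(blocks)
    return seq

# First two rows of the complete sequence given above.
rows = [
    "1 ATGTATCGGA AACTCATTGC GCTGCCGTTT GCCCTGCTGC TTGCCGCTTG",
    "51 CGGCAGGGAA GAACCGCCCA AGGCATTGGA ATGCGCCAAC CCCGCCGTGT",
]
print(len(join_listing(rows)))
```

For the complete listing above, the final row starts at 1001 and holds 17 bases, which gives 1017 bases in total, that is 339 codons including the terminating stop.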

This corresponds to the amino acid sequence [<SEQ ID 684; ORF25-1>] (SEQ ID NO: 684; 
ORF25-1): 



1 MYRKLIALPF ALLLAA CGRE EPPKALECAN PAVLQGIRGN IQETLTQEAR 

51 SFAREDGRQF VDADKIIAAA YGLAFSLEHA SETQEGGRTF CIADLNITVP 

101 SETLADAKAN SPLLYGETAL SDIVRQKTGG NVEFKDGVLT AAVRFLPVKD 

151 GQTAFVDNTV GMAAQTLSAA LLPYGVKSIV MIDGKAVKKE DAVRILSGKA 

201 REEEPSKPTP ED I LEHNAAG GDAGVPQAAE GAPEPEILHP DDGERADTVT 

251 VSRGEVEEAR VQNQRAESEI TKLWGGLDTD VQKELVGEQR KWAQEKISNC 

301 RQAAAQADRQ EYAEYLKLQC DTRMTRERIQ YLRGYSID* 

Computer analysis of this amino acid sequence gave the following results: 
Homology with a predicted ORF from N. meningitidis (strain A) 

ORF25 (SEQ ID NO: 682) shows 98.3% identity over a 60 aa overlap with an ORF (ORF25a) 
(SEQ ID NO: 686) from strain A of N. meningitidis: 



10 20 30 

orf 25 . pep TDVQKELVGEQRKWAQEKI SNCRQAAAQAD 

MINIM 1 1 I Mill Mill MIMI II 

or f 2 5a VTVSRGEVE EAR VQNQRAESE I TKLWGGLDTDVQKELVGEXRKWAQEKI SNCRQAAAQAD 

250 260 270 280 290 300 



40 50 60 

orf 25 . pep RQE YAE YLKLQCDTRMTRER I QYLRGYS IDX 

I M 1 1 1 1 1 1 1 1 1 M 1 1 1 1 1 1 1 1 M 1 1 1 1 M 

or f 2 5a RQE YAE YLKLQCDTRMTRER I QYLRGYS I DX 

310 320 330 
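The identity figure quoted for this overlap can be recomputed from the two aligned segments. The sketch below is illustrative and makes two assumptions that are not stated in the text: the alignment is ungapped over the 60 residues, and the ambiguous residue `X` in ORF25a is counted as a mismatch. Under those assumptions the result agrees with the 98.3% reported above.

```python
def percent_identity(a: str, b: str) -> float:
    """Percentage of aligned positions at which both sequences carry the same residue."""
    if len(a) != len(b):
        raise ValueError("aligned sequences must have equal length")
    if not a:
        raise ValueError("empty alignment")
    matches = sum(1 for x, y in zip(a, b) if x == y)
    return 100.0 * matches / len(a)

# ORF25 partial sequence (60 residues, stop symbol omitted).
orf25 = "TDVQKELVGEQRKWAQEKISNCRQAAAQADRQEYAEYLKLQCDTRMTRERIQYLRGYSID"
# Matching segment of ORF25a: identical except for X at the 11th position.
orf25a = orf25[:10] + "X" + orf25[11:]

print(round(percent_identity(orf25, orf25a), 1))
```

Alignment programs differ in how they treat gaps and ambiguous residues, so a figure recomputed this way may not match a reported value exactly for every comparison in this document.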

The complete length ORF25a nucleotide sequence [<SEQ ID 685>] (SEQ ID NO: 685) is: 



1 ATGTATCGGA AACTCATTGC GCTGCCGTTT GCCCTGCTGC TTGCCGCTTG 

51 CGGCAGGGAA GAACCGCCCA AGGCATTGGA ATGCGCCAAC CCCGCCGTGT 

101 TGCAANGCAT ACGCNGCAAT ATTCAGGAAA CGCTCACGCA GGAAGCGCGT 

151 TCTTTCGCGC GCGAAGACNG CANGCAGTTT GTCGATGCCG ACNAAATTAT 

201 CGCCGCCGCC TANGNTNNGN NGNTNTCTTT GGAACACGCT TCGGAAACGC 

251 AGGAAGGCGG GCGCACGTTC TGTNTCGGCG ATTTGAACAT TACCGTGCCG 

301 TCTGAAACGC TTGCCGATGC CAAGGCAAAC AGCCCCCTGC TGTACGGGGA 
351 AACCGCTTTG TCGGATATTG TGCGGCAGAA GACGGGCGGC AATGTCGAGT 

401 TTAAAGACGG CGTATTGACG GCAGCCGTCC GCTTCCTACC CGTCAAAGAC 
451 GGTCAGANGG CATTTGTCGA CAACACGGTC GGTATGGCGG CGCAAACGCT 
501 GTCTGCCGCG TTGCTGCCTT ACGGCGTGAA GAGCATCGTG ATGATAGACG 
551 GCAAGGCGGT AAAAAAAGAA GACGCGGTCA GGATTNTGAG CNGANAAGCC 
601 CGTGAANAAG AACCGTCCAA ANCCNNGCCC GAAGACATTT TGGAACATAA 
651 TGCCGCCGGA GGGGATGCAG ACGTACCCCA AGCCGGAGAA GACGCGCCCG 
701 AACCGGAAAT CCTGCATCCT GACGACGGCG AGCGTGCCGA TACCGTTACC 

751 GTATCACGGG GCGAAGTGGA AGAGGCGCGN GTACAAAACC AGCGTGCGGA 

801 ATCCGAAATT ACCAAACTTT GGGGAGGACT CGATACCGAC GTGCAAAAAG 

851 AGTTGGTCGG CGAANAACGC AAGTGGGCGC AGGAAAAAAT CAGCAACTGC 

901 CGACAAGCCG CCGCGCAGGC AGACCGGCAG GAATACGCCG AATACCTCAA 

951 GCTGCAATGC GACACGCGGA TGACGCGCGA ACGGATACAG TATCTTCGCG 

1001 GCTATTCCAT CGATTAG 

This encodes a protein having amino acid sequence [<SEQ ID 686>] (SEQ ID NO: 686): 



1 MYRKLIALPF ALLLAA CGRE EPPKALECAN PAVLQXIRXN IQETLTQEAR 

51 SFAREDXXQF VDADXIIAAA XXXXXSLEHA SETQEGGRTF CXADLNITVP 

101 SETLADAKAN SPLLYGETAL SDIVRQKTGG NVEFKDGVLT AAVRFLPVKD 

151 GQXAFVDNTV GMAAQTLSAA LLPYGVKSIV MIDGKAVKKE DAVRIXSXXA 

201 REXEPSKXXP EDILEHNAAG GDADVPQAGE DAPEPEILHP DDGERADTVT 

251 VSRGEVEEAR VQNQRAESEI TKLWGGLDTD VQKELVGEXR KWAQEKISNC 

301 RQAAAQADRQ EYAEYLKLQC DTRMTRERIQ YLRGYSID* 

ORF25a (SEQ ID NO: 686) and ORF25-1 (SEQ ID NO: 684) show 93.5% identity in 338 aa 
overlap: 



10 20 30 40 50 60 

orf 2 5a . pep MYRKLIALPFALLLAACGREEPPKALECANPAVLQXIRXNIQETLTQEARS FAREDXXQF 

1 1 1 1 H 1 1 1 1 II M I M M 1 1 1 1 1 i 1 1 1 1 ! U II 1 1 1 1 1 II 1 1 1 1 : 1 1 1 II 

orf 2 5-1 MYRKLIALPFALLLAACGREEPPKALECANPAVLQGIRGNIQETLTQEARSFAREDGRQF 

10 20 30 40 50 60 

70 80 90 100 110 120 

or f 2 5a . pep VDADXI IAAAXXXXXSLEHASETQEGGRTFCXADLNITVPSETLADAKANSPLLYGETAL 

. II 1 1 Mill 1 1 1 1 1 i 1 1 1 1 1 1 1 1 1 1 I IIIIMIIIIIIIIIIIIMI III 

orf 25-1 VDADKIIAAAYGLAFSLEHASETQEGGRTFCIADLNITVPSETLADAKANSPLLYGETAL 

70 80 90 100 110 120 

130 140 150 160 170 180 

orf 25a . pep SD I VRQKTGGNVEFKDGVLTAAVRFLPVKDGQXAFVDNTVGMAAQTLSAALLPYGVKS IV 

I I I I I M I I I I I I I I I I I I I I I I I I I I I I I hi I I I h I II I I I I I I I I I I M Ml I I 
or f 2 5 - 1 SDIVRQKTGGNVEFKDGVLTAAVRFLPVKDGQTAFVDNTVGMAAQTLSAALLPYGVKS IV 

130 140 150 160 170 180 

190 200 210 220 230 240 

orf 2 5a. pep M I DGKAVKKEDAVRIXSXXAREXEPS KXXPED I LEHNAAGGDADVPQAGEDAPE PE I LHP 

MM lllllllllll I III I II I : II II I II M I II II 1 1 1 1 : 1 I II I II 1 1 1 

orf 25-1 MIDGKAVKKEDAVRILSGKAREEEPSKPTPEDILEHNAAGGDAGVPQAAEGAPEPEILHP 

190 200 210 220 230 240 

250 260 270 280 290 300 

orf 2 5a . pep DDGERADTVTVSRGEVEEARVQNQRAESEITKLWGGLDTDVQKELVGEXRKWAQEKISNC 

I I I I II I I I I M I I I I I I I I I I I M I I I I I I II I I I II I II II II I I I lllllllllll 
orf 2 5 - 1 DDGERADTVTVSRGEVEEARVQNQRAESEITKLWGGLDTDVQKELVGEQRKWAQEKISNC 

250 260 270 280 290 300 

310 320 330 339 

orf 25a . pep RQAAAQADRQEYAE YLKLQCDTRMTRER I QYLRGYS I DX 

IMIIMIM MM IIIIMIIIIIIIIIIIIIIII 

orf 25- 1 RQAAAQ ADRQEYAE YLKLQCDTRMTRER I QYLRGYS I DX 

310 320 330 
Homology with a predicted ORF from N. gonorrhoeae 

ORF25 (SEQ ID NO: 682) shows 100% identity over a 60 aa overlap with a predicted ORF
(ORF25ng) (SEQ ID NO: 688) from N. gonorrhoeae:

orf 25 .pep TDVQKELVGEQRKWAQEKISNCRQAAAQAD 3 0 

Illlllllllllllllllllllllllllll 
orf 25ng VTVSRGEVEEARVQNQRAESEITKLWGGLDTDVQKELVGEQRKWAQEKISNCRQAAAQAD 308 

orf 25 .pep RQEYAEYLKLQCDTRMTRERIQYLRGYS ID 60 

i M I M 1 1 1 M M 1 1 1 1 1 1 1 M M 1 1 1 1 1 1 

orf 2 5ng RQEYAEYLKLQCDTRMTRERIQYLRGYS ID 338 

The complete length ORF25ng nucleotide sequence [<SEQ ID 687>] (SEQ ID NO: 687) is:

1 ATGTATCGGA AACTCATTGC GCTGCCGTTT GCCCTGCTGC TTGCAGCGTG 

51 CGGCAGGGAA GAACCGCCCA AGGCGTTGGA ATGCGCCAAC CCCGCCGTGT 

101 TGCAGGACAT ACGCGGCAGT ATTCAGGAAA CGCTCACGCA GGAAGCGCGT 

151 TCTTTCGCGC GCGAAGACGG CAGGCAGTTT GTCGATGCCG ACAAAATTAT 

201 CGCCGCCGCC TACGGTTTGG CGTTTTCTTT GGAACACGCT TCGGAAACGC

251 AGGAAGGCGG GCGCACGTTC TGTATCGCCG ATTTGAACAT TACCGTGCCG

301 TCTGAAACGC TTGCCGATGC CGAGGCAAAC AGCCCCCTGC TGTATGGGGA
351 AACGTCTTTG GCAGACATCG TGCAGCAGAA GACGGGCGGC AATGTCGAGT

401 TTAAAGACGG CGTATTGACG GCAGCCGTCC GCTTCCTGCC CGCCAAAGAC
451 GCTCGGACGG CATTTATCGA CAACACGGTC GGTATGGCGA CGCAAACGCT
501 GTCTGCCGCG TTGCTGCCTT ACGGCGTGAA GAGCATCGTG ATGATAGACG 
551 GCAAGGCGGT GACAAAAGAA GACGCGGTCA GGGTTTTGAG CGGCAAAGCC 
601 CGTGAAGAAG AACCGTCCAA ACCCACCCCC GAAGACATTT TGGAACACAA 
651 TGCCGCCGGC GGCGATGCGG GCGTACCCCA AGCCGCAGAA GGCGCACCCG 
701 AACCCGAAAT CCTGCATCCC GACGACGTCG AGCGTGCCGA TACCGTTACC 
751 GTATCACGGG GCGAAGTGGA AGAGGCGCGC GTACAAAACC AACGTGCGGA 
801 ATCCGAAATT ACCAAACTTT GGGGAGGACT CGATACCGAC GTGCAAAAAG 
851 AGTTGGTCGG CGAACAGCGC AAGTGGGCGC AGGAAAAAAT CAGcaactgc 
901 cgACAAGCCG CCGCGCAGGC AGACCGGCAG GAATACGCCG AATACCTCAA
951 GCTCCAATGC GACACGCGGA TGACGCGCGA ACggaTACAG TATCTTCGCG 

1001 GCTATTCCAT CGATTAG 

This encodes a protein having amino acid sequence [<SEQ ID 688>] (SEQ ID NO: 688):



1 MYRKLIALPF ALLLAACGRE EPPKALECAN PAVLQDIRGS IQETLTQEAR

51 SFAREDGRQF VDADKIIAAA YGLAFSLEHA SETQEGGRTF CIADLNITVP 

101 SETLADAEAN SPLLYGETSL ADIVQQKTGG NVEFKDGVLT AAVRFLPAKD 

151 ARTAFIDNTV GMATQTLSAA LLPYGVKSIV MIDGKAVTKE DAVRVLSGKA 

201 REEEPSKPTP EDILEHNAAG GDAGVPQAAE GAPEPEILHP DDVERADTVT 

251 VSRGEVEEAR VQNQRAESEI TKLWGGLDTD VQKELVGEQR KWAQEKISNC 

301 RQAAAQADRQ EYAEYLKLQC DTRMTRERIQ YLRGYSID* 
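
The relationship between the nucleotide listing and the encoded protein can be checked with a short script. The following is a minimal sketch, assuming the standard genetic code and reading frame 1; the function name `translate` and the use of X for codons containing ambiguous bases are illustrative choices, not part of the specification.

```python
from itertools import product

# Standard genetic code, codons enumerated in T, C, A, G order
# (first base varies slowest, third base fastest).
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(dna):
    """Translate a DNA string in frame 1; whitespace is ignored.

    Codons containing anything other than A, C, G, T (for example N)
    are reported as X. A trailing partial codon is dropped.
    """
    dna = "".join(dna.split()).upper()
    return "".join(
        CODON_TABLE.get(dna[i:i + 3], "X")
        for i in range(0, len(dna) - 2, 3)
    )


# First 30 and last 12 nucleotides of the ORF25ng listing (SEQ ID NO: 687).
print(translate("ATGTATCGGA AACTCATTGC GCTGCCGTTT"))
print(translate("TCCATCGATTAG"))
```

Applied to the first 30 nucleotides of SEQ ID NO: 687, this gives MYRKLIALPF, which agrees with the start of SEQ ID NO: 688; the final four codons give SID followed by the stop codon.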

ORF25ng (SEQ ID NO: 688) and ORF25-1 (SEQ ID NO: 684) show 95.9% identity in 338 aa
overlap: 



10 20 30 40 50 60 

orf 25-1 .pep MYRKLIALPFALLLAACGREEPPKALECANPAVLQGIRGNIQETLTQEARSFAREDGRQF 

1 1 1 i 1 1 1 [ 1 1 1 1 1 1 1 E 1 1 1 1 ! 1 1 1 1 1 11 I i h 1 1 1 1 1 1 1 1 1 1 1 i i 1 1 i 1 1 1

orf25ng MYRKLIALPFALLLAACGREEPPKALECANPAVLQDIRGSIQETLTQEARSFAREDGRQF

10 20 30 40 50 60

70 80 90 100 110 120 

orf25-1.pep VDADKIIAAAYGLAFSLEHASETQEGGRTFCIADLNITVPSETLADAKANSPLLYGETAL
I ! I I I I II I I I I I I I I I I I I I I I I I I I Ml I I I I I i I I I I M I Ih I U I I I I I M 
orf25ng VDADKIIAAAYGLAFSLEHASETQEGGRTFCIADLNITVPSETLADAEANSPLLYGETSL 

70 80 90 100 110 120 

130 140 150 160 170 180 

orf 25 - 1 . pep SDIVRQKTGGNVEFKDGVLTAAVRFLPVKDGQTAFVDNTVGMAAQTLSAALLPYGVKSIV 

: I I h I I I I I I I I I I I I I I I I II I I I I • I I - I II ' II I I I I I - I I I I I I I I I I I I I I I I 
orf25ng ADIVQQKTGGNVEFKDGVLTAAVRFLPAKDARTAFIDNTVGMATQTLSAALLPYGVKSIV 

130 140 150 160 170 180 

190 200 210 220 230 240 

orf 25-1 .pep MIDGKAVKKEDAVRILSGKAREEEPSKPTPEDILEHNAAGGDAGVPQAAEGAPEPEILHP 

Illllll i 1 1 1 i M I M 1 1 1 1 I U 1 1 1 II I M I M 1 1 1 1 M 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 

orf25ng MIDGKAVTKEDAVRVLSGKAREEEPSKPTPEDILEHNAAGGDAGVPQAAEGAPEPEILHP 

190 200 210 220 230 240

250 260 270 280 290 300 

orf 25 - 1 . pep DDGERADTVTVSRGEVEEARVQNQRAESEITKLWGGLDTDVQKELVGEQRKWAQEKISNC 

II IIIIIMIIIIIIIilllll I IIIMIMMMIMIIMI IMIMMIIIM 

orf25ng DDVERADTVTVSRGEVEEARVQNQRAESEITKLWGGLDTDVQKELVGEQRKWAQEKISNC 

250 260 270 280 290 300 

310 320 330 339 

orf 25- 1 . pep RQAAAQADRQE YAEYLKLQCDTRMTRER I QYLRGYS IDX 

I I M 1 1 1 1 1 II M M I M 1 1 M 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 

orf25ng RQAAAQADRQE YAEYLKLQCDTRMTRER I QYLRGYS IDX 

310 320 330
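
The percent identity figures quoted for these alignments can be reproduced by counting matching columns. The sketch below is illustrative only: it assumes a gap-free pairwise alignment given as two equal-length strings and divides by the number of aligned columns; the function name `percent_identity` is hypothetical, and the two strings are the first 60-residue block of the ORF25-1 and ORF25ng alignment.

```python
def percent_identity(seq_a, seq_b):
    """Return (identical columns, percent identity) for two aligned strings.

    A column counts as identical when both residues are the same and
    neither is a gap character.
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("aligned sequences must have equal length")
    identical = sum(1 for a, b in zip(seq_a, seq_b) if a == b and a != "-")
    return identical, 100.0 * identical / len(seq_a)


# Residues 1-60 of ORF25-1 and ORF25ng, written in blocks of ten.
ORF25_1_BLOCK = (
    "MYRKLIALPF" "ALLLAACGRE" "EPPKALECAN"
    "PAVLQGIRGN" "IQETLTQEAR" "SFAREDGRQF"
)
ORF25NG_BLOCK = (
    "MYRKLIALPF" "ALLLAACGRE" "EPPKALECAN"
    "PAVLQDIRGS" "IQETLTQEAR" "SFAREDGRQF"
)

identical, pct = percent_identity(ORF25_1_BLOCK, ORF25NG_BLOCK)
print(f"{identical}/{len(ORF25_1_BLOCK)} identical ({pct:.1f}%)")
```

Over this block the two sequences differ at positions 36 and 40 only. Applying the same count to all 338 aligned residues gives 324 identities, consistent with the 95.9% stated above.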



Based on this analysis, including the presence of a predicted prokaryotic membrane lipoprotein
lipid attachment site (underlined) in the gonococcal protein, it was predicted that the proteins from
N. meningitidis and N. gonorrhoeae, and their epitopes, could be useful antigens for vaccines or
diagnostics, or for raising antibodies.

ORF25-1 (SEQ ID NO: 684) (37 kDa) was cloned in pET and pGex vectors and expressed in
E. coli, as described above. The products of protein expression and purification were analyzed by
SDS-PAGE. Figure 16A shows the results of affinity purification of the GST-fusion protein, and
Figure 16B shows the results of expression of the His-fusion in E. coli. Purified His-fusion protein
was used to immunise mice, whose sera were used for Western blot (Figure 16C), ELISA (positive
result), and FACS analysis (Figure 16D). These experiments confirm that ORF25-1 (SEQ ID NO:
684) is a surface-exposed protein, and that it is a useful immunogen.

Figure 16E shows plots of hydrophilicity, antigenic index, and AMPHI regions for ORF25-1 (SEQ
ID NO: 684).



Example 82 

The following partial DNA sequence was identified in N. meningitidis [<SEQ ID 689>] (SEQ ID
NO: 689):



1 ATGCAGCTGA TCGACTATTC ACATTCATTT TTCTCGGTTG TGCCACCCTT 

51 TTTGGCACTG GCACTTGCCG TCATTACCCG CCGCGTACTG CTGTCTTTAG 

101 GCATCGGTAT TCTGGwysGC GTTGCCTTTT TGGTCGGCGG CAACCCCGTC 

151 GACGGTCTGA CACACCTGAA AGACATGGTC GTCGGCTTGG CTTGGTCAGA 

201 CGsyGATTGG TCGCTGGGCA AACCAAAAAT CTTGGTTTTC CkGATACTTT 

251 TGGGTATTTT TACTTCCCTG CTGACCTACT CCGGCAGCAA T 

// 

851 AC TTCGCTGGTA 

901 TTCGGCGGCA CTTGCGGCGT CTTTGCCGTC GTTCTCTGCA CGCTCGGCAC 

951 GATTAAAACC GCCGACTATC CCAAAGCCGT TTGGCAGGGT GCGAAATCTA 

1001 TGTTCGGCGC AATCGCCATT TTAATCCTCG CTTGGCTCAT CAGTACGGTT 

1051 GTCGGCGAAA TGCACACCGG CGATTACCTC TCCACACTGG TTGCGGGCAA 

1101 CATCCATCCC GGCTTCCTGC CCGTCATCCT CTTCCTGCTC GCCAGCGTGA 

1151 TGGCGTTTGC CACAGGCACA AGCTGGGGGA CGTTCGGCAT TATGCTGCCG 

1201 ATTGCCGCCG CCATGGCGGT CAAAGTCGAA CCCGCGCTGA TTATCCCGTG 

1251 TATGTCCGCA GTAATGGCGG GGGCGGTATG CGGCGACCAC TGCTCGCCCA 

1301 TTTCCGACAC GACCATCCTG TCGTCCACCG GCGCGCGCTG CAACCACATC

1351 GACCACGTTA CCTCGCAACT GCCTTACGCC TTAACCGTTG CCGCCGCCGC

1401 CGCATCGGGC TACCTCGCAT TGGGTCTGAC AAAATCCGCG CTGTTGGGCT 

1451 TTGGCACGAC AGGCATTGTA TTGGCGGTGC TGATTTTTCT GTTGAAAGAT 

1501 AAAAAA. . 



This corresponds to the amino acid sequence [<SEQ ID 690; ORF26>] (SEQ ID NO: 690;
ORF26):



1 MQLIDYSHSF FSVVPPFLAL ALAVITRRVL LSLGIGILXX VAFLVGGNPV

51 DGLTHLKDMV VGLAWSDXDW SLGKPKILVF XILLGIFTSL LTYSGSN. . . 

// 

251 TSLV 

301 FGGTCGVFAV VLCTLGTIKT ADYPKAVWQG AKSMFGAIAI LILAWLISTV 

351 VGEMHTGDYL STLVAGNIHP GFLPVILFLL ASVMAFATGT SWGTFGIMLP 

401 IAAAMAVKVE PALIIPCMSA VMAGAVCGDH CSPISDTTIL SSTGARCNHI 

451 DHVTSQLPYA LTVAAAAASG YLALGLTKSA LLGFGTTGIV LAVLIFLLKD 

501 KK. . 



Further work revealed the complete nucleotide sequence [<SEQ ID 691>] (SEQ ID NO: 691):



1 ATGCAGCTGA TCGACTATTC ACATTCATTT TTCTCGGTTG TGCCACCCTT 

51 TTTGGCACTG GCACTTGCCG TCATTACCCG CCGCGTACTG CTGTCTTTAG 

101 GCATCGGTAT TCTGGTCGGC GTTGCCTTTT TGGTCGGCGG CAACCCCGTC 

151 GACGGTCTGA CACACCTGAA AGACATGGTC GTCGGCTTGG CTTGGTCAGA 

201 CGGCGATTGG TCGCTGGGCA AACCAAAAAT CTTGGTTTTC CTGATACTTT 

251 TGGGTATTTT TACTTCCCTG CTGACCTACT CCGGCAGCAA TCAGGCGTTT

301 GCCGACTGGG CAAAACGGCA CATTAAAAAC CGGCGCGGCG CGAAAATGCT

351 GACCGCCTGC CTCGTGTTCG TAACCTTTAT CGACGACTAT TTCCACAGTC 

401 TCGCCGTCGG TGCGATTGCC CGCCCCGTTA CCGACAAGTT TAAAGTTTCC

451 CGCACCAAAC TCGCCTACAT CCTCGACTCC ACTGCCGCTC CTATGTGCGT 

501 GCTGATGCCC GTTTCAAGCT GGGGCGCGTC GATTATCGCC ACGCTTGCCG 

551 GACTGCTCGT TACCTACAAA ATCACCGAAT ACACGCCGAT GGGGACGTTT 

601 GTCGCCATGA GCCTGATGAA CTATTACGCA CTGTTTGCCC TGATTATGGT 

651 GTTCGTCGTC GCATGGTTTT CCTTCGACAT CGGCTCGATG GCACGTTTCG 

701 AACAAGCCGC GTTGAACGAA GCCCACGATG AAACTGCCGT TTCAGACGCT 

751 ACCAAAGGTC GTGTTTACGC ACTGATTATT CCCGTTTTGG CCTTAATCGC 

801 CTCAACGGTT TCCGCCATGA TCTACACCGG CGCGCAGGCA AGCGAAACCT 

851 TCAGCATTTT GGGGGCATTT GAAAACACGG ACGTAAACAC TTCGCTGGTA 

901 TTCGGCGGCA CTTGCGGCGT CCTTGCCGTC GTTCTCTGCA CGCTCGGCAC 

951 GATTAAAACC GCCGACTATC CCAAAGCCGT TTGGCAGGGT GCGAAATCTA 

1001 TGTTCGGCGC AATCGCCATT TTAATCCTCG CTTGGCTCAT CAGTACGGTT 

1051 GTCGGCGAAA TGCACACCGG CGATTACCTC TCCACACTGG TTGCGGGCAA 

1101 CATCCATCCC GGCTTCCTGC CCGTCATCCT CTTCCTGCTC GCCAGCGTGA 

1151 TGGCGTTTGC CACAGGCACA AGCTGGGGGA CGTTCGGCAT TATGCTGCCG 

1201 ATTGCCGCCG CCATGGCGGT CAAAGTCGAA CCCGCGCTGA TTATCCCGTG

1251 TATGTCCGCA GTAATGGCGG GGGCGGTATG CGGCGACCAC TGCTCGCCCA

1301 TTTCCGACAC GACCATCCTG TCGTCCACCG GCGCGCGCTG CAACCACATC
1351 GACCACGTTA CCTCGCAACT GCCTTACGCC TTAACCGTTG CCGCCGCCGC

1401 CGCATCGGGC TACCTCGCAT TGGGTCTGAC AAAATCCGCG CTGTTGGGCT
1451 TTGGCACGAC AGGCATTGTA TTGGCGGTGC TGATTTTTCT GTTGAAAGAT 
1501 AAAAAACGCG CCAACGCCTG A 



This corresponds to the amino acid sequence [<SEQ ID 692; ORF26-1>] (SEQ ID NO: 692;
ORF26-1):

1 MQLIDYSHSF FSVVPPFLAL ALAVITRRVL LSLGIGILVG VAFLVGGNPV
51 DGLTHLKDMV VGLAWSDGDW SLGKPKILVF LILLGIFTSL LTYSGSNQAF
101 ADWAKRHIKN RRGAKMLTAC LVFVTFIDDY FHSLAVGAIA RPVTDKFKVS
151 RTKLAYILDS TAAPMCVLMP VSSWGASIIA TLAGLLVTYK ITEYTPMGTF
201 VAMSLMNYYA LFALIMVFVV AWFSFDIGSM ARFEQAALNE AHDETAVSDA
251 TKGRVYALII PVLALIASTV SAMIYTGAQA SETFSILGAF ENTDVNTSLV
301 FGGTCGVLAV VLCTLGTIKT ADYPKAVWQG AKSMFGAIAI LILAWLISTV
351 VGEMHTGDYL STLVAGNIHP GFLPVILFLL ASVMAFATGT SWGTFGIMLP
401 IAAAMAVKVE PALIIPCMSA VMAGAVCGDH CSPISDTTIL SSTGARCNHI
451 DHVTSQLPYA LTVAAAAASG YLALGLTKSA LLGFGTTGIV LAVLIFLLKD
501 KKRANA*

Computer analysis of this amino acid sequence gave the following results: 

Homology with the hypothetical transmembrane protein HI1586 (SEQ ID NO: 1156) of
H. influenzae (accession number P44263)

ORF26 (SEQ ID NO: 690) and HI1586 (SEQ ID NO: 1156) show 53% and 49% amino acid
identity in 97 and 221 aa overlap at the N-terminus and C-terminus, respectively:



Orf26 1 ' MQL IDYSHS FFS WP PFLALALAV I TRRVXXXXXXXXXXXVAFLVGGNPVDGLTHLKDMV 60 

M+LID+S S +S+VP LA+ LA+ TRRV L +L V 

HI1586 14 MELIDFSSSVWS IVPALLAI ILAIATRRVLVSLSAGI I IGSLMLSDWQIGSAFNYLVKNV 73 






Orf26 61 VGLAWSDXDWS LGKPKI LVFX I LLG I FTS LLT YSGSN 97 

V L ++D + + I++F +LLG+ T+LLT SGSN 

HI1586 74 VSLVYADGEIN-SNMNIVLFLLLLGVLTALLTVSGSN 109 



// 



Orf26 86 IFTSLLTYSGS--NTSLVFGGTCGVFAVVLCTL--GTIKTADYPKAVWQGAKSMFGXXXX 141

+ F+ L T+ + TSLV GG C + L + + +Y + + G KSM G 
HI1586 299 VFSVLGTFENTVVGTSLVVGGFCSI I ISTLLI ILDRQVSVPEYVRSWIVGIKSMSGAIAI 358 

Orf26 142 XXXXXXXSTWGEMHTGDYLSTLVAGNIHPGFLPVILFLLASVMAFATGTSWGTFGIMLP 201 
+ +VG+M TG YLS+LV+GNI FLPVILF+L + MAF+TGTSWGTFGIMLP 
HI1586 359 LFFAWTINKIVGDMQTGKYLSSLVSGNIPMQFLPVILFVLGAAMAFSTGTSWGTFGIMLP 418



Orf26 202 I AAAMAVKVEPAL 1 1 PCMS AVMAGAVCGDHCS P I SDTT I LS STGARCNH IDHVTSQXXXX 261 

IAAAMA P L++PC+SAVMAGAVCGDHCSP+SDTTILSSTGA+CNHIDHVT+Q 
HI1586 419 IAAAMAANAAPELLLPCLSAVMAGAVCGDHCSPVSDTTILSSTGAKCNHIDHVTTQLPYA 478 

Orf26 262 XXXXXXXXXXXXXXXXXKSALLGFGTTGIVLAVLIFLLKDK 302 
S L GF T + L V+IF +K +

HI 1586 479 ATVATATS IGYI WGFTYSGLAGFAATAVSLIVI I FAVKKR 519 

Homology with a predicted ORF from N. meningitidis (strain A) 

ORF26 (SEQ ID NO: 690) shows 58.2% identity over a 502 aa overlap with an ORF (ORF26a)
(SEQ ID NO: 694) from strain A of N. meningitidis:

10 20 30 40 50 60

orf 26 . pep MQLIDYSHSFFSWPPFLALALAVITRRVLLSLGIGILXXVAFLVGGNPVDGLTHLKDMV 

1 1 U 1 1 1 1 1 1 1 1 1 1 1 1 1 1 M M 1 1 1 M 1 : 1 1 i M 1 1 1 1 1 1 1 II 1 1 1 1 1 II 1 1 1 1 1 1 

orf 26a MQLIDYSHSFFSWPPFLALALAVITRRVLLSLGIGILVGVAFLVGGNPVDGLTHLKDMV 

10 20 30 40 50 60 

70 80 90 99

orf 26 . pep VGLAWS DXDWS LGKP KI LVFX I LLG I FTS LLT Y SGSNXX 

Illllll IIIIIMI III I I I I i I M M II I I I 
orf 26a VGLAWSDGDWSLGKP KXLVFL I LLG I FTS LLT Y SGSNQAFADWAKRH I KN RRGAKMLTAC 

70 80 90 100 110 120 




orf 2 6 .pep 



orf 26a LVFVTFID DYFHSLAVGAXARPVTDKFKVSRAKLAYILDSTAAPMCVLMP VSSWGASIIA 

130 140 150 160 170 180 


orf 26. pep 

orf 26a TLAGLLV TYKITEYTPMGTFVAMSLMNYY ALFALIMVFWAWFS FDI GSMARFEQAALME 

190 200 210 220 230 240 

100 110

orf 26. pep TSLV 

I I I I 

orf 26a AHDETAVSDGS WGRVYA L 1 1 P VLAL I AS TVS AM I YTGAQAS ET FS I LGAFENTDVNTS LV 

250 260 270 280 290 300

120 130 140 150 160 170

orf 26 . pep FGGTCG VF A WLCTL GT I KT AD Y P KAVWQGAKSM FG A I A I L I LAWL I S T W GEMHTGD YL 
llllllhlllllllllll M I I I I I I I I M I I I I I I I I I I I II I I I M I I I I I I I I I 
or f 2 6 a FGGTCGVLA WLCTL GT I KI AD Y P KAVWQGAKSM FG A I A I L I LAWL I S T W GEMHTGD YL 

310 320 330 340 350 360

180 190 200 210 220 230 

orf 26 .pep STLVAGNIHPGFLPVILFLLASVMAFATGTSWGTFGIMLPIAAAMAVKVEPALIIPCMSA 

II I II III Mill I III II INI II II MM INI II III II I llllhhl I II III I 

orf 26a STLVAGNIHPGFLXVILFLLASVMAFATGTSWGTFGIMLPIAAAMAVKVDPSLI IPCMSA 

370 380 390 400 410 420 

240 250 260 270 280 290 

orf 26 .pep VMAGAVCG DHCS PISDTTILSS TGARCNH I DHVTSQLP Y ALTVAAAAASG YLALGL TKS A 

II 1 1 1 1 1 1 1 1 Ml 1 1 1 M 1 1 M 1 1 1 : 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 M 1 1 II 1 1 1 1 1 

orf 26a VMAGAVCG DHCSPISDTTILSSTGARCNHIDHVTSQLPY ALTVAAAAASGYLALGL TKSA 
430 440 450 460 470 480 

300 310 
orf26.pep LLGFGTTGIVLAVLIFL LKDKK 

MllhllMMIMIMMM 
orf26a LLGFGXTGIVLAVLIFL LKDKKRANAX 

490 500 

The complete length ORF26a nucleotide sequence [<SEQ ID 693>] (SEQ ID NO: 693) is:



1 ATGCAGCTGA TCGACTATTC ACATTCATTT TTCTCGGTTG TGCCACCCTT
51 TTTGGCACTG GCACTTGCCG TCATTACCCG CCGCGTACTG CTGTCTTTAG
101 GCATCGGTAT TCTGGTCGGC GTTGCCTTTT TGGTCGGCGG CAACCCCGTC
151 GACGGTCTGA CACACCTGAA AGACATGGTC GTCGGCTTGG CTTGGTCAGA
201 CGGCGATTGG TCGCTGGGCA AACCAAAANT CTTGGTTTTC CTGATACTTT
251 TGGGTATTTT TACTTCCCTG CTGACCTACT CCGGCAGCAA TCAGGCGTTT
301 GCCGACTGGG CAAAACGGCA CATTAAAAAC CGGCGCGGCG CGAAAATGCT
351 GACCGCCTGC CTCGTGTTCG TAACCTTTAT CGACGACTAT TTCCACAGTC
401 TCGCCGTCGG TGCGNTTGCC CGCCCCGTTA CCGACAAGTT TAAAGTTTCC
451 CGCGCCAAAC TCGCCTACAT CCTCGACTCC ACTGCCGCGC CTATGTGCGT
501 GCTGATGCCC GTTTCAAGCT GGGGCGCGTC GATTATCGCC ACGCTTGCCG
551 GACTGCTCGT TACCTACAAA ATCACCGAAT ACACGCCGAT GGGGACGTTT
601 GTCGCCATGA GCCTGATGAA CTATTACGCA CTGTTTGCCC TGATTATGGT
651 GTTCGTCGTC GCATGGTTCT CCTTCGACAT CGGCTCGATG GCACGTTTCG
701 AACAAGCCGC GTTGAACGAA GCCCACGATG AAACTGCCGT TTCAGACGGC
751 AGCTGGGGCA GGGTTTACGC ATTGATTATT CCCGTTTTGG CCTTAATCGC
801 CTCAACGGTT TCCGCCATGA TCTACACCGG TGCACAGGCA AGCGAAACCT
851 TCAGCATTTT GGGTGCATTT GAAAATACGG ACGTGAACAC TTCGCTGGTA
901 TTCGGCGGCA CTTGCGGCGT GCTTGCCGTC GTCCTCTGCA CGCTCGGCAC
951 GATTAAAATC GCCGATTATC CCAAAGCCGT TTGGCAGGGT GCGAAATCCA
1001 TGTTCGGCGC AATCGCCATT TTAATCCTTG CCTGGCTCAT CAGTACGGTT
1051 GTCGGCGAAA TGCACACAGG CGACTACCTC TCCACGCTGG TTGCGGGCAA
1101 CATCCATCCC GGCTTCCTGN CCGTCATCCT TTTCCTGCTC GCCAGCGTGA
1151 TGGCGTTTGC CACAGGCACA AGCTGGGGGA CGTTCGGCAT CATGCTGCCG
1201 ATTGCCGCCG CCATGGCGGT CAAAGTCGAT CCCTCACTGA TTATCCCGTG
1251 TATGTCCGCC GTGATGGCGG GGGCGGTATG CGGCGACCAC TGCTCGCCCA
1301 TTTCCGACAC GACCATCCTG TCGTCCACCG GCGCGCGCTG CAACCACATC
1351 GACCACGTTA CNTCGCAACT GCCTTACGCC TTAACCGTTG CCGCCGCCGC
1401 CGCATCGGGN TACCTCGCAT TGGGTCTGAC AAAATCCGCG CTGTTGGGTT
1451 TTGGCANGAC AGGCATTGTA TTGGCGGTGC TGATTTTTCT GTTGAAAGAT
1501 AAAAAACGCG CCAACGCCTG A



This encodes a protein having amino acid sequence [<SEQ ID 694>] (SEQ ID NO: 694):

1 MQLIDYSHSF FSVVPPFLAL ALAVITRRVL LSLGIGILVG VAFLVGGNPV
51 DGLTHLKDMV VGLAWSDGDW SLGKPKXLVF LILLGIFTSL LTYSGSNQAF
101 ADWAKRHIKN RRGAKMLTAC LVFVTFIDDY FHSLAVGAXA RPVTDKFKVS
151 RAKLAYILDS TAAPMCVLMP VSSWGASIIA TLAGLLVTYK ITEYTPMGTF
201 VAMSLMNYYA LFALIMVFVV AWFSFDIGSM ARFEQAALNE AHDETAVSDG
251 SWGRVYALII PVLALIASTV SAMIYTGAQA SETFSILGAF ENTDVNTSLV
301 FGGTCGVLAV VLCTLGTIKI ADYPKAVWQG AKSMFGAIAI LILAWLISTV
351 VGEMHTGDYL STLVAGNIHP GFLXVILFLL ASVMAFATGT SWGTFGIMLP
401 IAAAMAVKVD PSLIIPCMSA VMAGAVCGDH CSPISDTTIL SSTGARCNHI
451 DHVTSQLPYA LTVAAAAASG YLALGLTKSA LLGFGXTGIV LAVLIFLLKD
501 KKRANA*



ORF26a (SEQ ID NO: 694) and ORF26-1 (SEQ ID NO: 692) show 97.8% identity in 506 aa
overlap: 






10 20 30 40 50 60 

orf 26a . pep MQLIDYSHSFFSWPPFLALALAVITRRVLLSLGIGILVGVAFLVGGNPVDGLTHLKDMV 

II I II 1 1 1 1 MM I ' 1 1 1 1 1 1 1 1 1 II 1 1 1 i 1 1 1 1 M 1 1 1 II 1 1 1 1 1 1 1 1 M 1 1 1 1 1 1 1 1 

orf 26 - 1 MQLIDYSHSFFSWPPFLALALAVITRRVLLSLGIGILVGVAFLVGGNPVDGLTHLKDMV 

10 20 30 40 50 60 






70 80 90 100 110 120 

orf 26a . pep VGLAWSDGDWS LGKPKXLVFL I LLG I FTS LLTYSGSNQAFADWAKRH I KNRRGAKMLTAC 

II I Mill MM I IN Mill II MM II II MM II II II II II II MM II! 

orf 26 - 1 VGLAWSDGDWSLGKPKI LVFL I LLG I FTS LLTYSGSNQAFADWAKRH I KNRRGAKMLTAC 

70 80 90 100 110 120 






130 140 150 160 170 180 

orf 26a. pep LVFVT F I DDYFHS LA VGAXARPVTDKFKVS RAKLAYILDS TAAPMCVLMP VSSWGASIIA 

Mlllill MM III III 1 1 1 J 1 1 i 1 1 1 1 1 : 1 i 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 

orf 26-1 LVFVTFIDDYFHSLAVGAIARPVTDKFKVSRTKLAYILDSTAAPMCVLMPVSSWGASIIA 

130 140 150 160 170 180 






190 200 210 220 230 240 

orf 26a . pep TLAGLLVT YKI TE YTPMGTFVAMS LMNY YALFAL IMVFWAWFS FD I GSMARFEQAALNE 

II M II M II I I Mill MIMI I II II III II II II III II MM III 1 1 1 1 1 III 

orf 26 - 1 T LAGLLVT YKI TE YTPMGTFVAMS LMNY YALFAL I MVFWAWFSFD I GSMARFEQAALNE 

190 200 210 220 230 240 






250 260 270 280 290 300

orf 26a. pep AHDETAVSDGSWGRVYAL 1 1 P VLAL I ASTVS AM I YTGAQAS ETFS I LGAFENTDVNTS LV 

I I I I 1 III:: | || I I I I I I I I I II II II I M I I I I II I II II I I M II I II I I I II M 
orf26-l AHDETAVSDATKGRVYALI I PVLALIASTVSAMI YTGAQASETFSILGAFENTDVNTSLV 

250 260 270 280 290 300 






310 320 330 340 350 360 

orf 26a . pep FGGTCGVLAWLCTLGT I KI ADYPKAVWQGAKSMFGAI AI LI LAWLI STWGEMHTGDYL 

1 1 i I M 1 1 1 II I II 1 1 1 1 1 1 N 1 1 1 M i 1 1 1 1 1 1 1 ! 1 1 1 1 1 Ill II 

or f 2 6 - 1 FGGTCGVLAWLCTLGT I KTADYPKAVWQGAKSMFGAI AI L I LAWL I STWGEMHTGDYL 

310 320 330 340 350 360 

370 380 390 400 410 420 

orf 26a . pep STLVAGNIHPGFLXVILFLLASVMAFATGTSWGTFGIMLPIAAAMAVKVDPSLIIPCMSA 

IMMMIIMM MMMMMIIMM IMIMM IIIIIIIMMMIIIIMI 

orf26-l STLVAGNIHPGFLPVILFLLASVMAFATGTSWGTFGIMLPIAAAMAVKVEPALIIPCMSA 

370 380 390 400 410 420



430 440 450 460 470 480

orf 26a . pep VMAGAVCGDHCSPISDTTILSSTGARCNHIDHVTSQLPYALTVAAAAASGYLALGLTKSA 
I I I I II II II II II I I III I I I I I I I III I II UNI I I I I I I I I I I II II I Ml I I 
orf26-l VMAGAVCGDHCS PI SDTTI LSSTGARCNHIDHVTSQLPYALTVAAAAASGYLALGLTKS A 

430 440 450 460 470 480 

490 500 
orf 26a. pep LLGFGXTGI VLAVLI FLLKDKKRANAX 

I I I I hi I I I I I M I II I I I I I I II 
orf26-l LLGFGTTGIVLAVLIFLLKDKKRANAX 

490 500 

Homology with a predicted ORF from N. gonorrhoeae 

ORF26 (SEQ ID NO: 690) shows 94.8% and 99% identity in 97 and 206 aa overlap at the N-terminus
and C-terminus, respectively, with a predicted ORF (ORF26ng) (SEQ ID NO: 696) from
N. gonorrhoeae:






orf 26 . pep MQLIDYSHSFFSWPPFLALALAVITRRVLLSLGIGILXXVAFLVGGNPVDGLTHLKDMV 60 

II I I I I I I I II I I I I I I I I I I I I I I M I I I I I I I I I I I I I I i I I I I I I I I I I 

orf 26ng MQLIDYSHSFFSWPPFLALALAVITRRVLLSLGIGILVGVAFLVGGNPVDGLTHLKDMV 60 

orf 26 .pep VGLAWSDXDWSLGKPKILVFXILLGIFTSLLTYSGSN 97 

Illlhl llllllllllll llllllllllllllll 

or f 2 6 ng VGLAWADGDWS LGKPKI LVFLI LLG I FTS LLTYS GSNQAFADWAKRH I KNRCGAKMLTAC 12 0 

// 

orf26 pep TSLVFGGTCGVFAWLCTLGTIKTADYPKA 326 

IIIIIIIMIhMlllhlllllllllll 

orf2 6ng ASTVSAMIYTGAQASETFSILGAFENTDVNTSLVFGGTCGVLAWLCTFGTIKTADYPKA 326 

orf 26 .pep VWQGAKSMFGAIAILILAWLISTWGEMHTGDYLSTLVAGNIHPGFLPVILFLLASVMAF 3 86 

I I I I I I I I I I I I I I I I I I h I I I I I I h II I I I I I II I I I I I I I I I I II I I I h I h I I 

orf26ng VWQGAKSMFGAIAILILAWLISTWGEMHTGDYLSTLVAGNIHPGFLPVILFLLASVMAF 386 

orf 26 .pep ATGTSWGTFG IMLP I AAAMAVKVEPALI I PCMSAVMAGAVCGDHCS P I SDTTI LSSTGAR 446 

Ml Mil III IIIMMIIIIIMMIIIMIIIIIIIMIMMI MM IIIIMI IN 

orf26ng ATGTSWGTFGIMLPIAAAMAVKVEPALIIPCMSAVMAGAVCGDHCSPISDTTILSSTGAR 446 

orf 26. pep CNHIDHVTSQLPYALTVAAAAASGYLALGLTKSALLGFGTTGIVLAVLIFLLKDKK 502 

Ml IIIIMI MM IIIIMI III 1 1 II III! Mil III II II I MM II MM II 

orf26ng CNHIDHVTSQLPYALTVAAAAASGYLALGLTKSALLGFGTTGIVLAVLIFLLKDKKRADV 506 

The complete length ORF26ng nucleotide sequence [<SEQ ID 695>] (SEQ ID NO: 695) is:



1 ATGCAGCTGA TTGACTATTC ACATTCATTT TTCTCGGTTG TGCCACCCTT
51 TTTGGCACTG GCACTTGCCG TCATTACCCG CCGCGTACTG CTGTCTTTAG
101 GCATCGGTAT TTTGGTCGGC GTTGCCTTTT TGGTCGGCGG CAACCCCGTC
151 GACGGTCTGA CACACCTGAA AGACATGGTC GTCGGCTTGG CTTGGGCAGA
201 CGGCGATTGG TCGCTGGGCA AACCAAAAAT CTTGGTTTTC CTGATACTTT
251 TGGGCATTTT CACTTCACTG CTGACCTACT CCGGCAGCAA TCAGGCGTTT
301 GCCGACTGGG CAAAACGGCA CATTAAAAAC CGGTGCGGCG CGAAAATGCT
351 GACCGCCTGC CTCGTGTTCG TAACCTTTAT CGACGACTAT TTCCACAGCC
401 TCGCCGTCGG TGCGATTGCC CGCCCCGTTA CCGACAAGTT TAAAGTTTCC
451 CGCGCCAAAC TCGCCTACAT CCTCGACTCC ACTGCCTCGC CCATGTGCGT
501 GCTGATGCCC GTTTCAAGCT GGGGCGCGTC GATTATCGCC ACGCTTGCCG
551 GATTGCTCGT TACCTACAAA ATTACCGAAT ACACGCCGAT GGGGACGTTT
601 GTCGCCATGA GCCTGATGAA CTATTACGCG CTGTTTGCCC TGATTATGGT
651 ATTCGTCGTC GCATGGTTCT CCTTCGACAT CGGCTCGAtg gCGCGTTTCG
701 AACAGGCTGC GTTGAACGAA gcccaggacg aaaccgccgc tTCAGACgCT
751 ACCAAAGGTC GTGTTTACGC ATTGATTATT CCCGTTTTGG CCTTAATCGC
801 CTCAACGGTT TCCGCCATGA TCTACACCGG CGCGCAGGCA AGCGAAACCT
851 TCAGCATTTT GGGGGCATTT GAAAATACCG ACGTAAACAC TTCGCTGGTA
901 TTCGGCGGCA CTTGCGGCGT GCTTGCCGTC GTCCTCTGCA CGTTCGGCAC
951 GATTAAAACC GCCGATTATC CCAAAGCCGT GTGGCAGGGT GCGAAATCCA
1001 TGTTCGGCGC AATCGCCATT TTAATCCTCG CCTGGCTCAT CAGTACGGTT
1051 GTCGGCGAAA TGCACACGGG CGACTACCTC TCCACGCTGG TTGCGGGCAA
1101 CATCCATCCC GGCTTCCTGC CCGTCATCCT CTTCCTGCTC GCCAGCGTGA
1151 TGGCGTTTGC CACAGGCACA AGCTGGGGGA CGTTCGGCAT TATGCTGCCG
1201 ATTGCCGCCG CCATGGCGGT CAAAGTCGAA CCCGCGCTGA TTAtcccGTG
1251 TATGTCCGCA GTAATGGCGG GGGCGGTATG CGGCGACCAC TGTTCGCCCA
1301 TCTCCGACAC GACCATCCTG TCGTCCACCG GCGCGCGCTG CAACCACATC
1351 GACCACGTTA CCTCGCAACT GCCTTATGCC CTGACGGTTG CCGCCGCCGC
1401 CGCATCGGGC TACCTCGCAT TGGGTCTGAC AAAATCCGCG CTGTTGGGCT
1451 TTGGCACGAC CGGTATTGTA TTGGCGGTGC TGATTTTTCT GTTGAAAGAT
1501 AAAAAACGCG CCGACGTTTG A

This encodes a protein having amino acid sequence [<SEQ ID 696>] (SEQ ID NO: 696):

1 MQLIDYSHSF FSVVPPFLAL ALAVITRRVL LSLGIGILVG VAFLVGGNPV
51 DGLTHLKDMV VGLAWADGDW SLGKPKILVF LILLGIFTSL LTYSGSNQAF
101 ADWAKRHIKN RCGAKMLTAC LVFVTFIDDY FHSLAVGAIA RPVTDKFKVS
151 RAKLAYILDS TASPMCVLMP VSSWGASIIA TLAGLLVTYK ITEYTPMGTF
201 VAMSLMNYYA LFALIMVFVV AWFSFDIGSM ARFEQAALNE AQDETAASDA
251 TKGRVYALII PVLALIASTV SAMIYTGAQA SETFSILGAF ENTDVNTSLV
301 FGGTCGVLAV VLCTFGTIKT ADYPKAVWQG AKSMFGAIAI LILAWLISTV
351 VGEMHTGDYL STLVAGNIHP GFLPVILFLL ASVMAFATGT SWGTFGIMLP
401 IAAAMAVKVE PALIIPCMSA VMAGAVCGDH CSPISDTTIL SSTGARCNHI
451 DHVTSQLPYA LTVAAAAASG YLALGLTKSA LLGFGTTGIV LAVLIFLLKD
501 KKRADV*

ORF26ng (SEQ ID NO: 696) and ORF26-1 (SEQ ID NO: 692) show 98.4% identity in 505 aa
overlap: 






10 20 30 40 50 60 

orf26-l .pep MQLIDYSHSFFSWPPFLALALAVITRRVLLSLGIGILVGVAFLVGGNPVDGLTHLKDMV 

1 1 1 1 1 1 ■ 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 ! 1 1 1 1 1 1 1 1 1 i 1 1 i I M M M ! 1 1 1 i 1 1 1 1 ! M 1 1 1 

orf26ng MQL IDYSHS FFS WPP FLALALAVI TRRVLLS LG I G I LVGVAFLVGGNP VDGLTHLKDMV 

10 20 30 40 50 60 






70 80 90 100 110 120 

orf26-l .pep VGLAWSDGDWSLGKPKILVFLILLGIFTSLLTYSGSNQAFADWAKRHIKNRRGAKMLTAC 

INI: lllillilllMlllllllMIIMIIIIII I MINIUM IIIIIIM 

orf26ng VGLAWADGDWS LGKP KI LVFL I LLG I FTSLLTYSGSNQAFADWAKRH I KNRCGAKMLTAC 

70 80 90 100 110 120 






130 140 150 160 170 180 

or f 2 6 - 1 . pep LVFVTFIDDYFHSLAVGAIARPVTDKFKVSRTKLAYILDSTAAPMCVLMPVSSWGASI IA 

MMI MMMMMIMIMM MMMMIMMI MMIIMIMMMMM 

orf26ng LVFVTFIDDYFHSLAVGAIARPVTDKFKVSRAKLAYILDSTASPMCVLMPVSSWGASIIA

130 140 150 160 170 180

190 200 210 220 230 240 

orf 26 - 1 . pep TLAGLLVTYKITEYTPMGTFVAMSLMNYYALFALIMVFWAWFSFDIGSMARFEQAALNE 
I M M I I I I I I I M I I I I I I M I I .1 I I 1 I I I I I I I I I I I M I I I I I I I I I I I I I I :| 
orf26ng TLAGLLVTYKITEYTPMGTFVAMSLMNYYALFALIMVFVVAWFSFDIGSMARFEQAALNE

190 200 210 220 230 240 






250 260 270 280 290 300 

orf 26 - 1 . pep AHDETAVSDATKGRVYALI IPVLALIASTVSAMI YTGAQASETFS ILGAFENTDVNTSLV 

'■I I hi II I ! I I I I I I I II I I I I I I II M I I I I I I I I I I I i I ! I I I I I I I I I M I I 
orf 26ng AQDETAASDATKGRVYAL I IPVLALIASTVSAMI YTGAQASETFS ILGAFENTDVNTSLV 

250 260 270 280 290 300 






310 320 330 340 350 360 

orf 26 - 1 . pep FGGTCGVLAWLCTLGT I KTADYPKAVWQGAKSMFGA I AI L I LAWL I STWGEMHTGDYL 

lllllllllllllhlllllllllMIIIIIIIIIIMIIIIIMIIIIIIIIIIIIIM 

or f 2 6ng FGGTCGVLAWLCTFGT I KTADYPKAVWQGAKSMFGAIAI LI LAWL I STWGEMHTGDYL 

310 320 330 340 350 360 






370 380 390 400 410 420 

orf 26-1 .pep STLVAGNIHPGFLPVILFLLASVMAFATGTSWGTFGIMLPIAAAMAVKVEPALIIPCMSA 

MM llllll MIIIIIIIIIIIMIIIIIIIIIhlMIII IIIIIIIIIIIIIM 

orf26ng STLVAGNIHPGFLPVILFLLASVMAFATGTSWGTFGIMLPIAAAMAVKVEPALIIPCMSA 

370 380 390 400 410 420 






430 440 450 460 470 480 

orf 26-1 .pep VMAGAVCGDHCS P I SDTT I LS STGARCNH I DHVTSQLPYALTVAAAAASGYLALGLTKS A 

1 1 1 M 1 1 1 1 1 1 1 1 1 M II 1 1 MM I M 1 1 1 1 1 1 1 1 1 M 1 1 1 1 1 1 1 1 1 1 1 1 MM 1 1 I 

orf26ng VMAGAVCGDHCS PI SDTT I LS STGARCNH I DHVTSQLPYALTVAAAAASGYLALGLTKS A 

430 440 450 460 470 480 






490 500 
orf 26-1 .pep LLGFGTTG I VLAVL I FLLKDKKRANAX 

I I I I I II II I I II I I II I M I I I- 
or f 2 6 ng LLGFGTTG I VLAVL I FLLKDKKRADVX 

490 500 



In addition, ORF26ng (SEQ ID NO: 696) shows significant homology to a hypothetical
H. influenzae protein (SEQ ID NO: 1156):






sp|P44263|YF86_HAEIN HYPOTHETICAL PROTEIN HI1586 >gi|1074850|pir||C64037 hypothetical
protein HI1586 - Haemophilus influenzae (strain Rd KW20) >gi|1574427 (U32832) H.
influenzae predicted coding region HI1586 [Haemophilus influenzae] Length = 519
Score = 538 bits (1370), Expect = e-152 

Identities = 280/507 (55%), Positives = 346/507 (68%), Gaps = 7/507 (1%) 

Query: 1 MQLIDYSHSFFSWPPFLALALAVITRRXXXXXXXXXXXXXAFLVGGNPVDGLTHLKDMV 60 

M+LID+S S +S+VP LA+ LA+ TRR L +L V 

Sbjct : 14 MELIDFSSSVWSIVPALLAIILAIATRRVLVSLSAGIIIGSLMLSDWQIGSAFNYLVKNV 73 

Query: 61 VGLAWADGDWSLGKPKILVFLILLGIFTSLL^YSGSNQAFADWAKRHIKNRCGAKMLTAC 120 

V L +ADG+ + I++FL+LLG+ T+LLT SGSN+AFA+WA+ IK R GAK+L A 

Sbjct: 74 VSLVYADGEIN-SNMNIVLFLLLLGVLTALLTVSGSNRAFAEWAQSRIKGRRGAKLLAAS 132 



Query: 121 LVFVTFIDDYFHSLAVGAIARPVTDKFKVSRAKLAYILDSTAS PMCVLMPVSSWGASI IA 180 






LVFVTFIDDYFHSLAVGAIARPVTD+FKVSRAKLAYILDSTA+PMCV+MPVSSWGA II
Sbjct: 133 LVFVTFIDDYFHSLAVGAIARPVTDRFKVSRAKLAYILDSTAAPMCVMMPVSSWGAYIIT 192 

Query: 181 TLAGLLVTYKITEYTPMGTFVAMSLMNYYALFALIMVFVVAWFSFDIGSMARFEQAALNE 240
+ GLL TY ITEYTP+G FVAMS MN+YA+F++IMVF VA+FSFDI SM R E+ AL 
Sbjct: 193 LIGGLIATYSITEYTPIGAFVAMSSMNFYAIFSIIMVFFVAYFSFDIASMVRHEKLALKN 252

Query: 241 AQDETAASDATKGRVYAL 1 1 PVLAL I AS TVS AM I YTGAQA SETFS ILGAFENTDVN 296 

+D+ TKG+V LI+P+L LI +TVS MIYTGA+A + FS+LG FENT V 

Sbjct: 253 TEDQLEEETGTKGQVRNLILPILVLIIATVSMMIYTGAEAIiAADGKVFSVLGTFENTWG 312 

Query: 297 TSLVFGGTCGVL- - AWLCTFGTIKTADYPKAVWQGAKSMFGXXXXXXXXXXXSTWGEM 354 
TSLV GG C ++ + + + + +Y ++ G KSM G + +VG+M

Sbjct: 313 TSLVVGGFCSIIISTLLIILDRQVSVPEYVRSWIVGIKSMSGAIAILFFAWTINKIVGDM 372

Query: 355 HTGDYLSTLVAGNIHPGFLPVILFLLASVMAFATGTSWGTFGIMLPIAAAMAVKVEPALI 414 

TG YLS+LV+GNI FLPVILF+L + MAF+TGTSWGTFGIMLPIAAAMA P L+ 
Sbjct: 373 QTGKYLSSLVSGNIPMQFLPVILFVLGAAMAFSTGTSWGTFGIMLPIAAAMAANAAPELL 432 

Query: 415 IPCMSAVMAGAVCGDHCSPISDTTILSSTGARCNHIDHVTSQXXXXXXXXXXXXXXXXXX 474

+PC+SAVMAGAVCGDHCSP+SDTTILSSTGA+CNHIDHVT+Q
Sbjct: 433 LPCLSAVMAGAVCGDHCSPVSDTTILSSTGAKCNHIDHVTTQLPYAATVATATSIGYIVV 492

Query: 475 XXXKSALLGFGTTGIVLAVLIFLLKDK 501
S L GF T + L V+IF +K +
Sbjct: 493 GFTYSGLAGFAATAVSLIVIIFAVKKR 519
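
The percentages in the BLAST summary line above follow directly from the reported fractions. As a rough illustration, the sketch below parses such a line and recomputes each ratio; the function name `parse_blast_summary` and the regular expression are hypothetical helpers written for this example, not part of any BLAST distribution.

```python
import re

# Matches fields such as "Identities = 280/507 (55%)".
SUMMARY_FIELD = re.compile(
    r"(Identities|Positives|Gaps) = (\d+)/(\d+) \((\d+)%\)"
)


def parse_blast_summary(line):
    """Return {field: (numerator, denominator, reported percent)}."""
    return {
        name: (int(num), int(den), int(pct))
        for name, num, den, pct in SUMMARY_FIELD.findall(line)
    }


summary = (
    "Identities = 280/507 (55%), Positives = 346/507 (68%), "
    "Gaps = 7/507 (1%)"
)
parsed = parse_blast_summary(summary)
for name, (num, den, pct) in parsed.items():
    print(f"{name}: {num}/{den} = {100 * num / den:.1f}% (reported {pct}%)")
```

The denominator, 507, is the length of the alignment including gap columns, which is why it differs from the 519-residue length of HI1586.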

Based on this analysis, it is predicted that these proteins from N. meningitidis and N. gonorrhoeae,
and their epitopes, could be useful antigens for vaccines or diagnostics, or for raising antibodies.

Example 83 

The following partial DNA sequence was identified in N. meningitidis [<SEQ ID 697>] (SEQ ID
NO: 697):

1 . . AAGCAATGGT ATGCCGACGN . AGTATCAAG ACGGAAATGG TTATGGTCAA 

51 CGATGAGCCT GCCAAAATTC TGACTTGGGA TGAAAGCGGC CGATTACTCT 

101 CGGAACTGTC TATCCGCCAC CATCAACGCA ACGGGGTGGT TTTGGAGTGG 

151 TATGAAGATG GTTCTAAAAA GAGCGAAGT . GTTTATCAGG ATGACAAGTT 

201 GGTCAGGAAA ACCCAGTGGG ATAAGGATGG TTATTTAATC GAACCCTGA

This corresponds to the amino acid sequence [<SEQ ID 698; ORF27>] (SEQ ID NO: 698;
ORF27):

1 ..KQWYADXSIK TEMVMVNDEP AKILTWDESG RLLSELSIRH HQRNGVVLEW
51 YEDGSKKSEX VYQDDKLVRK TQWDKDGYLI EP*

Further work revealed the complete nucleotide sequence [<SEQ ID 699>] (SEQ ID NO: 699):



1 ATGAAAAAAT TATCTCGGAT TGTATTTTCA ACTGTCCTGT TGGGTTTTTC

51 GGCCGCTTTG CCGGCGCAGA CCTATTCTGT TTATTTTAAT CAGAACGGAA

101 AGCTGACGGC GACGATGTCT TCTGCCGCTT ATATCAGGCA ATATAGTGTG 

151 GTGGCGGGTA TTGCGCACGC GCAGGATTTT TATTATCCGT CGATGAAGAA 

201 ATATTCTGAA CCTTATATCG TTGCTTCAAC GCAAATCAAA TCTTTTGTGC 

251 CTACCCTGCA AAACGGTATG TTGATTTTGT GGCATTTTAA TGGTCAGAAA 

301 AAAATGGCGG GGGGCTTCAG CAAGGGTAAG CCGGACGGGG AGTGGGTCAA 

351 CTGGTATCCG AACGGTAAAA AATCTGCCGT TATGCCTTAT AAAAATGGCT 

401 TGAGTGAGGG TACGGGATAC CGCTATTACC GTAACGGCGG CAAGGAAAGC

451 GAAATCCAGT TTAAGCAAAA TAAGGCAAAC GGCGTATGGA AGCAATGGTA

501 TGCCGACGGC AGTATCAAGA CGGAAATGGT TATGGTCAAC GATGAGCCTG 

551 CCAAAATTCT GACTTGGGAT GAAAGCGGCC GATTACTCTC GGAACTGTCT 

601 ATCCGCCACC ATCAACGCAA CGGGGTGGTT TTGGAGTGGT ATGAAGATGG 

651 TTCTAAAAAG AGCGAAGCTG TTTATCAGGA TGACAAGTTG GTCAGGAAAA 

701 CCCAGTGGGA TAAGGATGGT TATTTAATCG AACCCTGA 

This corresponds to the amino acid sequence [<SEQ ID 700; ORF27-1>] (SEQ ID NO: 700;
ORF27-1):

1 MKKLSRIVFS TVLLGFSAAL PAQTYSVYFN QNGKLTATMS SAAYIRQYSV
51 VAGIAHAQDF YYPSMKKYSE PYIVASTQIK SFVPTLQNGM LILWHFNGQK
101 KMAGGFSKGK PDGEWVNWYP NGKKSAVMPY KNGLSEGTGY RYYRNGGKES
151 EIQFKQNKAN GVWKQWYADG SIKTEMVMVN DEPAKILTWD ESGRLLSELS
201 IRHHQRNGVV LEWYEDGSKK SEAVYQDDKL VRKTQWDKDG YLIEP*

Computer analysis of this amino acid sequence gave the following results:

Homology with a predicted ORF from N. meningitidis (strain A)

ORF27 (SEQ ID NO: 698) shows 91.5% identity over an 82 aa overlap with an ORF (ORF27a)
(SEQ ID NO: 702) from strain A of N. meningitidis:



10 20 30 

orf 27 .pep KQWYADXSIKTEMVMVNDEPAKILTWDESG 

I I I I I I :|| Ml III Mllllllllllll 
o r f 2 7 a LS EGTGXR Y YRNGGKES E I QFKQNKANGVWKQW YADGN I KTEMVMVNDE P AKI LTWDESG 

140 150 160 170 180 190 



40 50 60 70 80 

orf 27 . pep RLLSELSIRHHQRNGWLEWYEDGSKKSEXVYQDDKLVRKTQWDKDGYLIEPX 

Illllllhll 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 I Mllllllllllll llllllll 

orf 27a RLLSELSIHHHXRNGWLEWYEDGSKKXEAVYQDDKLVRKTQWDXDGYLIEPX 
200 210 220 230 240 



The complete length ORF27a nucleotide sequence [<SEQ ID 701>] (SEQ ID NO: 701) is:



1 ATGAAAAAAT TATCTCGGAT TGTATTTTCA ACTGTCCTGT TGGGTTTTTC 

51 GGCCGCTTTG CCGGCGCAGA NCTATTCTGT TTATTTTAAT CAGAACGGGA 

101 AACTGACGGC GACGNTGTCT TCTGCCGCNT ATATCAGGCA ATATAGTGTG 

151 GCGGAGGGTA TTGCGCACGC GCAGGANTTT TANTATCCGT CGATGAAGAA 

201 ATATTCCGAA CCTTATATCG TTGCTTCAAC GCAAATCAAA TCTTTTGTGC 

251 CTACCCTGCA AAACGGTATG TTGATTTTGT GGCATTTTAA NGGTCAGAAA 

301 AAAATGGCNG GGGGCTTCAG CAAGGGTAAG CCGGACGGGG AGTGGGTCAA 

351 CTGGTATCCG AACGGTAAAA AATCTGCCGT TATGCCTTAT AAAAATGGTT 

401 TGAGTGAAGG TACGGGGTNN CGCTATTACC GTAACGGCGG CAAGGAAAGC 

451 GAAATCCAGT TTAAACAGAA TAAGGCAAAC GGCGTATGGA AGCAATGGTA 

501 TGCCGACGGC AATATCAAAA CGGAAATGGT TATGGTCAAT GATGAGCCTG 

551 CCAAAATTCT GACATGGGAT GAAAGCGGTC GATTACTCTC GGAACTGTCT 

601 ATCCATCATC ATNAACGTAA TGGAGTAGTC TTAGAGTGGT ATGAAGATGG 

651 TTCTAAAAAG ANTGAAGCTG TTTATCAGGA TGATAAGTTG GTCAGGAAAA 

701 CCCAGTGGGA TAANGATGGT TATTTAATCG AACCCTGA 

This encodes a protein having amino acid sequence [<SEQ ID 702>] (SEQ ID NO: 702): 



1 MKKLSRIVFS TVLLGFSAAL PAQXYSVYFN QNGKLTATXS SAAYIRQYSV 

51 AEGIAHA QXF XYPSMKKYSE PYIVASTQIK SFVPTLQNGM LILWHFXGQK 

101 KMAGGFSKGK PDGEWVNWYP NGKKSAVMPY KNGLSEGTGX RYYRNGGKES 

151 EIQFKQNKAN GVWKQWYADG NIKTEMVMVN DEPAKILTWD ESGRLLSELS 

201 IHHHXRNGW LEWYEDGSKK XEAVYQDDKL VRKTQWDXDG YLIEP* 

ORF27a (SEQ ID NO: 702) and ORF27-1 (SEQ ID NO: 700) show 94.7% identity in 245 aa 
overlap: 






10 20 30 40 50 60 

orf 27a . pep MKKLSRIVFSTVLLGFSAALPAQXYSVYFNQNGKLTATXSSAAYIRQYSVAEGIAHAQXF 
I I I I I II I I I T I I I I I II I I M I I I I I I M I I I I I MIMMIMM Mill | 
orf 2 7 - 1 MKKLSRIVFSTVLLGFSAALPAQTYSVYFNQNGKLTATMSSAAYIRQYSWAGIAHAQDF 

10 20 30 40 50 60 






70 80 90 100 110 120 

orf 2 7a. pep XYPSMKKYSEPYIVASTQIKSFVPTLQNGMLILWHFXGQKKMAGGFSKGKPDGEWVNWYP 

1 1 1 II 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 II 1 1 1 1 1 M 1 1 M IMIIIMIM IMMIIMi 

orf 2 7-1 YYPSMKKYSEPYIVASTQIKSFVPTLQNGMLILWHFNGQKKMAGGFSKGKPDGEWVNWYP 

70 80 90 100 110 120 






130 140 150 160 170 180 

orf 27a . pep NGKKSAVMPYKNGLSEGTGXRYYRNGGKESEIQFKQNKANGVWKQWYADGNIKTEMVMVN 

IIIIIIIMI llllllll 1 1 ! 1 1 1 1 1 M II 1 1 1 II 1 1 1 1 II II 1 1 1 1 M 1 1 1 M 1 1 

orf 27-1 NGKKSAVMPYKNGLSEGTGYRYYRNGGKESEIQFKQNKANGVWKQWYADGSIKTEMVMVN 

130 140 150 160 170 180 






190 200 210 220 230 240 

orf 2 7a. pep DEPAKILTWDESGRLLS ELS I HHHXRNGWLEWYEDGSKKXEAVYQDDKL VRKTQWDXDG 

Illlllllllllllllllllhll lllllllllllllll llllllllllllllll II 
orf27-l DEPAKILTWDESGRLLS ELS I RHHQRNGVVLEWYEDGSKKSEAVYQDDKLVRKTQWDKDG 

190 200 210 220 230 240 



orf 27a. pep YLIEPX 
| | | | | | 

orf27-l YLIEPX 
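
The identity figures quoted for these alignments can be reproduced, at least approximately, by counting matching columns in a pairwise alignment. The following is a minimal Python sketch under that assumption. The function name `percent_identity` and the decision to count `X` (an unresolved residue) as a mismatch are illustrative choices, not details of the analysis software used for this application.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percentage of alignment columns in which both sequences share a residue.

    Both inputs must already be aligned (equal length, '-' marking gaps).
    Columns containing a gap or an unresolved residue 'X' are counted as
    mismatches; that convention is an assumption of this sketch.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    if not aligned_a:
        raise ValueError("alignment is empty")
    matches = sum(
        1
        for a, b in zip(aligned_a.upper(), aligned_b.upper())
        if a == b and a not in "-X"
    )
    return 100.0 * matches / len(aligned_a)


# Last five residues of the ORF27a / ORF27-1 alignment (terminal stop omitted).
print(percent_identity("YLIEP", "YLIEP"))
# A five-column stretch containing one unresolved residue.
print(round(percent_identity("WDXDG", "WDKDG"), 1))
```

On this counting, 232 identical positions in a 245-column overlap would round to 94.7%.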

Homology with a predicted ORF from N. gonorrhoeae 



ORF27 (SEQ ID NO: 698) shows 96.3% identity over an 82 aa overlap with a predicted ORF 
(ORF27ng) (SEQ ID NO: 704) from N. gonorrhoeae: 

orf27.pep KQWYADXSIKTEMVMVNDEPAKILTWDESG 30 

MINI II Mill I Mill I INI Mill 

orf 2 7ng LSEGTGYRYYRNGGKESEIQFKQNKANGVWKQWYADGSIKTEMVMVl^ 193 

orf 27 .pep RLLSELS I RHHQRNGWLEWYEDGS KKS EX VYQDDKLVRKTQWDKDGYL I E P 82 

Illlllllllhlllllllllllllllll II I I I I I I I I I I I I I I I I I I I I 
orf27ng RLLSELS IRHHKRNGWLEWYEDGSKKSEAVYQDDKLVRKTQWDKDGYLIEP 24 5 

The complete length ORF27ng nucleotide sequence [<SEQ ID 703>] (SEQ ID NO: 703) is: 



1 ATGAAGAAAT TATCTCGGAT TGTATTTTCA ATCGTACTGT TGGGTTTTTC 

51 GGCCGCTTTG CCGGCGCAGA CCTATTCTGT TTATTTTAAT CAGAACGGGA 

101 AACTGACGGC GACGATGTCT TCTGCCGCTT ATATCAGGCA ATATAGTGTG 

151 GCGGCGGGTA TCGCACACGC GCAGGATTTT TATTATCCGT CGATGAAGAA 

201 ATATTCCGAA CCTTATATCG TTGCTTCAAC GCAAATCAAA TCTTTTGTGC 

251 CTACCCTGCA AAACGGTATG TTGATTTTGT GGCATTTTAA TGGTCAGAAA 

301 AAAATGGCGG GGGGCTTCAG CAAGGGTAAG CCGGACGGGG AATGGGTCAA 
351 CTGGTATCCG AACGGTAAAA AATCTGCGGT TATGCCTTAT AAAAATGGCT 

401 TGAGTGAGGG TACGGGATAC CGTTATTACC GTAACGGCGG CAAGGAAAGC 
451 GAAATCCAGT TTAAGCAAAA TAAGGCGAAC GGCGTATGGA AGCAATGGTA 
501 TGCCGATGGA AGTATCAAGA CGGAAATGGT TATGGTCAAC GATGAGCCTG 
551 CCAAAATTCT GACTTGGGAT GAAAGCGGCC GATTACTTTC GGAACTGTCT 
601 ATCCGCCACC ATAAACGCAA CGGGGTGGTT TTGGAGTGGT ATGAAGATGG 
651 TTCTAAAAAG AGCGAGGCTG TTTATCAGGA TGACAAGTTG GTCAGGAAAA 
701 CCCAATGGGA TAAGGATGGT TATTTAATCG AACCCTGA 



This encodes a protein having amino acid sequence [<SEQ ID 704>] (SEQ ID NO: 704): 



1 MKKLSRIVFS IVLLGFSAAL PA QTYSVYFN QNGKLTATMS SAAYIRQYSV 

51 AAGIAHAQDF YYPSMKKYSE PYIVASTQIK SFVPTLQNGM LILWHFNGQK 

101 KMAGGFSKGK PDGEWVNWYP NGKKSAVMPY KNGLSEGTGY RYYRNGGKES 

151 EIQFKQNKAN GVWKQWYADG SIKTEMVMVN DEPAKILTWD ESGRLLSELS 

201 IRHHKRNGW LEWYEDGSKK SEAVYQDDKL VRKTQWDKDG YLIEP* 

ORF27ng (SEQ ID NO: 704) and ORF27-1 (SEQ ID NO: 700) show 98.8% identity in 245 aa 
overlap: 



10 20 30 40 50 60 

orf 27- 1 . pep MKKLSRIVFSTVLLGFSAALPAQTYSVYFNQNGKLTATMSSAAYIRQYSWAGIAHAQDF 

IIMIIIIII IMIIIIIIIIIIIIIIIIIIIIIIIIMII M Ihllll Mil 

orf27ng MKKLSRIVFS I VLLGFSAALPAQTYSVYFNQNGKLTATMSSAAY I RQYSVAAGIAHAQDF 

10 20 30 40 50 60 



70 80 90 100 110 120 

orf 2 7-1 .pep YYPSMKKYSEPYIVASTQIKS FVPTLQNGMLILWHFNGQKKMAGGFSKGKPDGEWVNWYP 

1 1 1 1 1 II 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 i II 1 1 1 1 1 1 1 1 1 1 i 1 1 1 M 1 1 1 1 1 1 1 1 1 1 1 1 i I 

orf 2 7ng YYPSMKKYSEPYIVASTQIKSFVPTLQNGMLILWHFNGQKKMAGGFSKGKPDGEWVNWYP 

70 80 90 100 110 120 

130 140. 150 160 170 180 

or f 2 7 - 1 . pep NGKKSAVMPYKNGLSEGTGYRYYRNGGKESEIQFKQNKANGVWKQWYADGS I KTEMVMVN 

1 1 1 1 1 1 1 1 1 1 II 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 M 1 1 1 1 1 II 1 1 1 1 1 

orf2 7ng NGKKSAVMPYKNGLSEGTGYRYYRNGGKESEIQFKQNKANGVWKQWYADGSIKTEMVMVN 

130 140 150 160 170 180 






190 200 210 220 230 240 

orf27-1.pep DEPAKILTWDESGRLLSELSIRHHQRNGWLEWYEDGSKKSEAVYQDDKLVRKTQWDKDG 

orf27ng 

190 200 210 220 230 240 

orf27-1.pep YLIEPX 

| | | | | | 

orf27ng YLIEPX 



Based on this analysis, including the putative leader sequence in the gonococcal protein, it was 
predicted that the proteins from N. meningitidis and N. gonorrhoeae, and their epitopes, could be 
useful antigens for vaccines or diagnostics, or for raising antibodies. 

ORF27-1 (SEQ ID NO: 700) (24.5 kDa) was cloned in pET and pGex vectors and expressed in 
E. coli, as described above. The products of protein expression and purification were analyzed by 
SDS-PAGE. Figure 17A shows the results of affinity purification of the GST-fusion protein, and 
Figure 17B shows the results of expression of the His-fusion in E. coli. Purified GST-fusion protein 
was used to immunise mice, whose sera were used for ELISA, which gave a positive result, 
confirming that ORF27-1 (SEQ ID NO: 700) is a surface-exposed protein and a useful immunogen. 

Example 84 

The following partial DNA sequence was identified in N. meningitidis [<SEQ ID 705>] (SEQ ID 
NO: 705): 

1 ATGAAATTTA CCAAGCACCC CGTCTGGGCA ATGGCGTTCC GCCCATTTTA 
51 TTCGCTGGCG GCTCTGTACG GCGCATTGTC CGTATTGCTG TGGGGTTTCG 
101 GCTACACGGG AACGCACkAG CTGTCCGGTT TCTATTGGCA CGCGCATGAg 
151 ATGATTTGGG GTTATGCCGG ACTGGTCGTC ATCGCCTTCC TGCTGACCGC 
201 CGTCGCCACT TGGACGGGGC AGCCGCCCAC GCGGGGCGGC GTaTCTGGTC 
251 GGCTTGACTA TCTTTTGGCT GGCTGCGCGG ATTGCCGCCT TTATCCCGGG 
301 TTGGGGTGCG TCGGCAAGCG GCATACTCGG TACGCTGTTT TTCTGGTACG 
351 GCGCGGTGTG CATGGCTTTG CCCGTTATCC GTTCGCAGAA TCAACGCAAC 
401 TATGTTgCCG TGTTCGCGCT GTTCGTCTTG GGCGGCACGC ATGCGGCGTT 
451 CCACGTCCAG CTGCACAACG GCAACCTAGG CGGACTCTTG AGCGGATTGC 
501 AGTCGGGCTT GGTGATG 

This corresponds to the amino acid sequence [<SEQ ID 706; ORF47>] (SEQ ID NO: 706; 
ORF47): 



1 MKFTKHPVWA MAFRPFYSLA ALYGALSVLL WGFGYTGTHX LSGFYWHAHE 
51 MIWGYAGLW IAFLLTAVAT WTGQPPTRGG VLVGLTIFWL AARIAAFIPG 
101 WGASASGILG TLFFWYGAVC MALPVIRSQN QRNYVAVFAL FVLGGTHAAF 
151 HVQLHNGNLG GLLSGLQSGL VM 

Further work revealed the complete nucleotide sequence [<SEQ ID 707>] (SEQ ID NO: 707): 

1 ATGAAATTTA CCAAGCACCC CGTCTGGGCA ATGGCGTTCC GCCCATTTTA 

51 TTCGCTGGCG GCTCTGTACG GCGCATTGTC CGTATTGCTG TGGGGTTTCG 

101 GCTACACGGG AACGCACGAG CTGTCCGGTT TCTATTGGCA CGCGCATGAG 

151 ATGATTTGGG GTTATGCCGG ACTGGTCGTC ATCGCCTTCC TGCTGACCGC 

201 CGTCGCCACT TGGACGGGGC AGCCGCCCAC GCGGGGCGGC GTTCTGGTCG 

251 GCTTGACTAT CTTTTGGCTG GCTGCGCGGA TTGCCGCCTT TATCCCGGGT 

301 TGGGGTGCGT CGGCAAGCGG CATACTCGGT ACGCTGTTTT TCTGGTACGG 

351 CGCGGTGTGC ATGGCTTTGC CCGTTATCCG TTCGCAGAAT CAACGCAACT 

401 ATGTTGCCGT GTTCGCGCTG TTCGTCTTGG GCGGCACGCA TGCGGCGTTC 
451 CACGTCCAGC TGCACAACGG CAACCTAGGC GGACTCTTGA GCGGATTGCA 
501 GTCGGGCTTG GTGATGGTGT CGGGTTTTAT CGGTCTGATT GGTACGCGGA 
551 TTATTTCGTT TTTTACGTCC AAACGCTTGA ATGTGCCGCA GATTCCCAGT 
601 CCGAAATGGG TGGCGCAGGC TTCGCTGTGG CTGCCCATGC TGACTGCCAT 
651 GCTGATGGCG CACGGTGTGT TGGCTTGGCT GTCTGCCGTT TTTGCCTTTG 
701 CGGCAGGTGT GATTTTTACC GTGCAGGTGT ACCGCTGGTG GTATAAACCC 
751 GTGTTGAAAG AGCCGATGCT GTGGATTCTG TTTGCCGGCT ATCTGTTTAC 
801 CGGATTGGGG CTGATTGCGG TCGGCGCGTC TTATTTCAAA CCCGCTTTCC 
851 TCAATCTGGG TGTGCATCTG ATCGGGGTCG GCGGTATCGG CGTGCTGACT 
901 TTGGGCATGA TGGCGCGTAC CGCGCTTGGT CATACGGGCA ATCCGATTTA 
951 TCCGCCGCCC AAAGCCGTTC CCGTTGCGTT TTGGCTGATG ATGGCGGCAA 

1001 CCGCCGTCCG TATGGTTGCC GTATTTTCTT CCGGCACTGC CTACACGCAC 

1051 AGCATCCGCA CCTCTTCGGT TTTGTTTGCA CTCGCGCTTT TGGTGTATGC 

1101 GTGGAAGTAT ATTCCTTGGC TGATTCGTCC GCGTTCGGAC GGCAGGCCCG 

1151 GTTGA 

This corresponds to the amino acid sequence [<SEQ ID 708; ORF47-1>] (SEQ ID NO: 708; 
ORF47-1): 

1 MKFTKHPVWA MAFRPFYSLA ALYGALSVLL WGFGYTGTHE LSGFYWHAHE 

51 MIWGYAGLW IAFLLTAVAT WTGQPPTRGG VLVGLTIFWL AARIAAFIPG 

101 WGASASGILG TLFFWYGAVC MALPVIRSQN QRNYVAVFAL FVLGGTHAAF 

151 HVQLHNGNLG GLLSGLQS GL VMVSGFIGLI GTRI I SFFTS KRLNVPQIPS 

201 PKWVAQASLW LPMLTAMLMA HGVLAWLSAV FAFAAGVIFT VQVYRWWYKP 

251 VLKEPMLWIL FAGYLFTGLG LIAVGASYFK PAFLNLGVHL IGVGGIGVLT 

301 LGMMARTALG HTGNPIYPPP KAVPVAFWLM MAATAVRMVA VFSSGTAYTH 
351 SIRTSSVLFA LALLVYAWKY IPWLIRPRSD GRPG* 
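
The correspondence between a nucleotide sequence and its encoded amino acid sequence, as listed above, can be checked by translating the DNA codon by codon. The sketch below assumes the standard genetic code and reports any codon containing an unresolved base as `X`. The names `CODON_TABLE` and `translate` are illustrative and are not taken from the software used in the original analysis.

```python
# Standard genetic code, built from the conventional TCAG ordering.
BASES = "TCAG"
AMINO_ACIDS = (
    "FFLLSSSSYY**CC*W"
    "LLLLPPPPHHQQRRRR"
    "IIIMTTTTNNKKSSRR"
    "VVVVAAAADDEEGGGG"
)
CODON_TABLE = {
    b1 + b2 + b3: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, b1 in enumerate(BASES)
    for j, b2 in enumerate(BASES)
    for k, b3 in enumerate(BASES)
}


def translate(dna: str) -> str:
    """Translate DNA in reading frame 1; '*' marks a stop codon.

    Whitespace (as in the ten-base blocks of the listing) is ignored.
    Codons that contain anything other than A, C, G or T become 'X'.
    A trailing partial codon is dropped.
    """
    seq = "".join(dna.split()).upper()
    usable = len(seq) - len(seq) % 3
    return "".join(
        CODON_TABLE.get(seq[pos:pos + 3], "X") for pos in range(0, usable, 3)
    )


# First 30 bases of the complete ORF47 nucleotide sequence listed above.
print(translate("ATGAAATTTA CCAAGCACCC CGTCTGGGCA"))  # MKFTKHPVWA
```

The ten residues produced for the first thirty bases agree with the start of the listed amino acid sequence.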

Computer analysis of this amino acid sequence predicts a leader peptide and also gave the 
following results: 

Homology with a predicted ORF from N. meningitidis (strain A) 

ORF47 (SEQ ID NO: 706) shows 99.4% identity over a 172 aa overlap with an ORF (ORF47a) 
(SEQ ID NO: 710) from strain A of N. meningitidis: 

10 20 30 40 50 60 

or f 4 7 . pep MKFTKHPWAMAFRPFYSLAALYGALSVLLWGFGYTGTHXLSGFYWHAHEM IWGYAGLVV 

MIMIIIIMMMIIIIIIIIIIMIIMIIIIIIII 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1! 1 1 1 1 

orf47a MKFTKHPWAMAFRPFYSLAALYGALSVLLWGFGYTGTHELSGFYWHAHEM IWGYAGLVV 

10 20 30 40 50 60 

70 80 90 100 110 120 

orf 47 . pep IAFLLTAVATWTGQPPTRGGVLVGLTIFWLAARIAAFIPGWGASASGILGTLFFWYGAVC 

Mill I Ml 1 1 III 1 1 II I lllllllllll MM INI MINI I MINIMI Mill I 

orf 4 7a I AFLLTAVATWTGQPPTRGGVLVGLTI FWLAARIAAFI PGWGASASGILGTLFFWYGAVC 

70 80 90 100 110 120 

130 140 150 160 170 

or f 4 7 . pep MALPVIRSQNQRN YVAVFALFVLGGTHAAF HVQLHNGNLGGLLSGLQSGLyM 

TTTl II I II I II II I II II I II II 1 1 1 II I II II II 1 1 II 1 1 1 1 II 1 1 1 1 M 

or f 4 7a ^4ALPVIRSQNQRN YVAVFALFVLGGTHAAF HVQLHNGNLGGLLSGLQS GLVMVSGFIGLI 

130 140 150 160 170 180 



Orf 4 7a GTRII SFFTSKRLNVPQIPSPKWVAQASLWLPMLTAMLMAHGVMPWLSAAFAFAAGVIFT 

190 200 210 220 230 240 

The complete length ORF47a nucleotide sequence [<SEQ ID 709>] (SEQ ID NO: 709) is: 

1 ATGAAATTTA CCAAGCACCC CGTTTGGGCA ATGGCGTTCC GCCCGTTTTA 
51 TTCACTGGCG GCTCTGTACG GCGCATTGTC CGTATTGCTG TGGGGTTTCG 
101 GCTACACGGG AACGCACGAG CTGTCCGGTT TCTATTGGCA CGCGCATGAG 
151 ATGATTTGGG GTTATGCCGG ACTGGTCGTC ATCGCCTTCC TGCTGACCGC 
201 CGTCGCCAGT TGGACGGGGC AGCCGCCCAC GCGGGGCGGC GTTCTGGTCG 
251 GCTTGACTAT CTTTTGGCTG GCTGCGCGGA TTGCCGCCTT TATCCCGGGT 
301 TGGGGTGCGT CGGCAAGCGG CATACTCGGT ACGCTGTTTT TCTGGTACGG 
351 CGCGGTGTGC ATGGCTTTGC CCGTTATCCG TTCGCAGAAT CAACGCAATT 
401 ATGTTGCCGT GTTCGCGCTG TTCGTCTTGG GCGGTACGCA CGCGGCGTTC 
451 CACGTCCAGC TGCACAACGG CAACCTAGGC GGACTCTTGA GCGGATTGCA 
501 GTCGGGCTTG GTGATGGTGT CGGGTTTTAT CGGTCTGATT GGTACGCGGA 
551 TTATTTCGTT TTTTACGTCC AAACGGTTGA ATGTGCCGCA GATTCCCAGT 
601 CCGAAATGGG TGGCGCAGGC TTCGCTGTGG CTGCCCATGC TGACCGCCAT 
651 GCTGATGGCG CACGGCGTGA TGCCTTGGCT GTCGGCGGCT TTCGCGTTTG 
701 CGGCAGGTGT GATTTTTACC GTGCAGGTGT ACCGCTGGTG GTATAAGCCT 
751 GTGTTGAAAG AGCCGATGCT GTGGATTCTG TTTGCCGGCT ATCTGTTTAC 
801 CGGATTGGGG CTGATTGCGG TCGGCGCGTC TTATTTCAAA CCCGCTTTCC 
851 TCAATCTGGG TGTGCATCTG ATCGGGGTCG GCGGTATCGG CGTGCTGACT 
901 TTGGGCATGA TGGCGCGTAC CGCGCTCGGT CATACGGGCA ATCCGATTTA 
951 TCCGCCGCCC AAAGCCGTTC CCGTTGCGTT TTGGCTGATG ATGGCGGCAA 
1001 CCGCCGTCCG TATGGTTGCC GTATTTTCTT CCGGCACTGC CTACACGCAC 
1051 AGCATACGCA CCTCTTCGGT TTTGTTTGCA CTCGCGCTTT TGGTGTATGC 
1101 GTGGAAGTAT ATTCCTTGGC TGATTCGTCC GCGTTCGGAC GGCAGGCCCG 
1151 GTTGA 



This encodes a protein having amino acid sequence [<SEQ ID 710>] (SEQ ID NO: 710): 



1 MKFTKHPVWA MAFRPFYSLA ALYGALSVLL WGFGYTGTHE LSGFYWHAHE 

51 MIWGYAGLW IAFLLTAVAT WTGQPPTRGG VLVGLTIFWL AARIAAFIPG 

101 WGASASGILG TLFFWYGAVC MALPVIRSQN QRNYVAVFAL FVLGGTHAAF 

151 HVQLHNGNLG GLLSGLQS GL VMVSGFIGLI GTRII SFFTS KRLNVPQIPS 

201 PKWVAQASLW LPMLTAMLMA HGVMPWLSAA FAFAAGVIFT VQVYRWWYKP 

251 VLKEPMLWIL FAGYLFTGLG LIAVGASYFK PAFLNLGVHL IGVGGIGVLT 

301 LGMMARTALG HTGNPIYPPP KAVP VAFWLM MAATAVRMVA V FSSGTAYTH 

351 SIRTSSVLFA LALLVYAWKY IPWLIRPRSD GRPG* 

ORF47a (SEQ ID NO: 710) and ORF47-1 (SEQ ID NO: 708) show 99.2% identity in 384 aa 
overlap: 

10 20 30 40 50 60 

or f 4 7a . pep MKFTKHPWAMAFRPFYSLAALYGALSVLLWGFGYTGTHELSGFYWHAHEMIWGYAGLVV 

5 i 1 1 ! I - 1 i 1 1 1 ! 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 III I 1 1 3 1 )! 1 1 1 ! 1 1 1 1 1 1 ! 1 1 1 1 1 1 ! 1 1 

or f 4 7 - 1 MKFTKHPWAMAFRPFYSLAALYGALSVLLWGFGYTGTHELSGFYWHAHEMIWGYAGLVV 

10 20 30 40 50 60 

70 80 90 ' 100 110 120 

orf4 7a.pep IAFLLTAVATWTGQPPTRGGVLVGLTIFWLAARIAAFIPGWGASASGILGTLFFWYGAVC 

10 I INI MIMIMIIIIIIIIIIMI MM IIIIIIIIIIIIIIIIIMII MM 

orf 4 7-1 IAFLLTAVATWTGQPPTRGGVLVGLTIFWLAARIAAFIPGWGASASGILGTLFFWYGAVC 

70 80 90 100 110 120 

130 140 150 160 170 180 

orf 4 7a . pep MALPVIRSQNQRNYVAVFALFVLGGTHAAFHVQLHNGNLGGLLSGLQSGLVMVSGFIGLI 

15 1 1 1 M M 1 1 1 1 1 1 1 1 M I M II 1 1 Ml II 1 1 M I II 1 1 1 1 1 1 1 1 1 1 1 M 1 1 1 1 M 1 1 

orf 4 7 - 1 MALPVIRSQNQRNYVAVFALFVLGGTHAAFHVQLHNGNLGGLLSGLQSGLVMVSGFIGLI 

130 140 150 160 170 180 

190 200 210 220 230 240 

orf 4 7a. pep GTRIISFFTSKRLNVPQIPSPKWVAQASLWLPMLTAMLMAHGVMPWLSAAFAFAAGVIFT 

20 M I I I I I I I I I I I I I I I I I I I I I I II I I I I I II I I II I I I I I I : I I I I : I I I I I I I I I I 

orf 4 7-1 GTRI ISFFTSKRLNVPQIPSPKWVAQASLWLPMLTAMLMAHGVLAWLSAVFAFAAGVIFT 

190 200 210 220 230 240 

250 260 270 280 290 300 

or f 4 7a . pep VQVYRWWYKPVLKEPMLWILFAGYLFTGLGLIAVGASYFKPAFLNLGVHLIGVGGIGVLT 

25 | | 1 1 1 1 | 1 1 1 1 | 1 1 1 1 | || 1 1 | 1 1 1 1 | 1 1 1 1 1 1 | 1 1 | | | | | || 1 1 | | | | 1 1 1 1| | | 1 1 1 1 

or f 4 7-1 VQ VYRWWY KP VLKE PMLW I L F AG YL FTGLGL I AVGAS Y FKPAFLNLGVHL I GVGG I GVLT 

250 260 270 280 290 300 

310 320 330 340 350 360 

orf 4 7a. pep LGMMARTALGHTGNPIYPPPKAVPVAFWLMMAATAVRMVAVFSSGTAYTHSIRTSSVLFA 

30 I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 

orf 47-1 LGMMARTALGHTGNPIYPPPKAVPVAFWLMMAATAVRMVAVFSSGTAYTHSIRTSSVLFA 

310 320 330 340 350 360 

370 380 
orf 4 7a . pep LALLVYAWKY I PWL IRPRSDGRPGX 

35 J | | || | | | | | | | | | | | | | | | | | | | | 

orf47-l LALLVYAWKY I PWL I RPRSDGRPGX 

370 380 

Homology with a predicted ORF from N. gonorrhoeae 

ORF47 (SEQ ID NO: 706) shows 97.1% identity over a 172 aa overlap with a predicted ORF 
(ORF47ng) (SEQ ID NO: 712) from N. gonorrhoeae: 

ORF4 7 MKFTKHPVWAMAFRPFYSLAALYGALS VLLWGFGYTGTHELSGFYWHAHEMI WGYAGLW 6 0 

1 1 1 1 1 1 1 i 1 1 1 1 1 ii 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 ii 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 

ORF47ng MKFTKHPVWAMAFRPFYSLAALYGALSVLLWGFGYTGTHELSGFYWHAHEMIWGYAGLW 60 

ORF47 IAFLLTAVATWTGQPPTRGGVLVGLTIFWLAARIAAFIPGWGASASGILGTLFFWYGAVC 120 



ORF4 7ng 




ORF4 7ng 



ORF4 7 




The ORF47ng nucleotide sequence [<SEQ ID 711>] (SEQ ID NO: 711) is predicted to encode a 
protein comprising amino acid sequence [<SEQ ID 712>] (SEQ ID NO: 712): 



1 MKFTKHPVWA MAFRPFYSLA ALYGALSVLL WGFGYTGTHE LSGFYWHAHE 

51 MIWGYAGLW IAFLLTAVAT WTGQPPTRGG VLVGLTAFWL AARIAAFIPG 

101 WGAAASGILG TLFFWYGAVC MALPVIRSQN RRNYVAVFAI FVLGGTHAAF 

151 HVQLHNGNLG GLLSGLQS GL VMVWGFIGLI GMKII SFFTS KRLKLPQIPS 

201 PKWVAHASLW LPMLNAILMA HRVMPW LSAA FPFAAGVIFT VQVY AGGITP 

251 IEETSCGSVA GICYRLGNSS G 



The predicted leader peptide and transmembrane domains are identical (except for an Ile/Ala 
substitution at residue 87 and a Leu/Ile substitution at position 140) to sequences in the 
meningococcal protein (see also Pseudomonas stutzeri orf396 (SEQ ID NO: 1157), accession 
number e246540): 

TM segments in ORF47ng 

INTEGRAL Likelihood -5.63 Transmembrane 52 - 68 
INTEGRAL Likelihood -3.88 Transmembrane 169 - 185 
INTEGRAL Likelihood -3.08 Transmembrane 82 - 98 
INTEGRAL Likelihood -1.91 Transmembrane 134 - 150 
INTEGRAL Likelihood -1.44 Transmembrane 107 - 123 
INTEGRAL Likelihood -1.38 Transmembrane 227 - 243 
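
The likelihood scores above come from the computer analysis referred to in the text, whose algorithm is not described here. As a rough illustration of why stretches of this kind are flagged as membrane-spanning, the sketch below computes a sliding-window average of Kyte-Doolittle hydropathy values. This is a simpler, swapped-in technique, not the method that produced the scores above. The 17-residue default window is chosen only because each listed segment spans 17 residues, and the function name `window_hydropathy` is hypothetical.

```python
# Kyte-Doolittle hydropathy index for the 20 standard amino acids.
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def window_hydropathy(protein: str, window: int = 17) -> list[float]:
    """Mean hydropathy of every full window along the sequence.

    Element i of the result covers residues i+1 .. i+window (1-based).
    Whitespace is ignored; unknown symbols such as 'X' contribute 0.0,
    which is an assumption of this sketch.
    """
    seq = "".join(protein.split()).upper()
    return [
        sum(KYTE_DOOLITTLE.get(res, 0.0) for res in seq[start:start + window]) / window
        for start in range(len(seq) - window + 1)
    ]


# Toy input: a hydrophobic run followed by a charged run.
for start, score in enumerate(window_hydropathy("IIIIKKKK", window=4), start=1):
    print(start, round(score, 1))
```

Windows with a strongly positive mean are candidates for membrane-spanning segments. Any threshold applied to these values would be a further assumption.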



Further work revealed the complete gonococcal DNA sequence [<SEQ ID 713>] (SEQ ID NO: 
713): 

1 ATGAAATTTA CCAAACATCC CGTCTGGGCA ATGGCGTTCC GCCCGTTTTA 
51 TTCACTGGCG GCACTGTACG GCGCATTGTC CGTATTGCTG TGGGGTTTCG 
101 GCTACACGGG AACGCACGAG CTGTCCGGTT TCTATTGGCA CGCGCATGAG 
151 ATGATTTGGG GTTATGCCGG TCTCGTCGTC ATCGCCTTCC TGCTGACCGC 
201 CGTCGCCACT TGGACGGGAC AGCCGCCCAC GAGGGGCGGC GTTCTGGTCG 
251 GCTTGACCGC CTTTTGGCTG GCTGCGCGGA TTGCCGCCTT TATCCCGGGT 
301 TGGGGTGCGG CGGCAAGCGG CATACTCGGT ACGCTGTTTT TCTGGTACGG 
351 CGCGGTGTGC ATGGCTTTGC CCGTTATCCG TtcgCAAAAC CGGCGCAACT 
401 ATGtcgCCGT ATTCGCAATA TTTGTGCTGG GCGGTACGCA TGCGgcgTTC 
451 CACGtccAgc tGCACAACGG CAACCTAGGC GGACTCTTGA GCGGATTGCA 
501 GTCGGGCCTG GTTATGGTGT CGGGCTTTAT CGGCCTGATT GGGATGAGGA 
551 TTATTTCGTT TTTTACGTCC AAACGGTTGA ACGTGCCGCA GATTCCCAGT 
601 CCGAAATGGG TGGCGCAGGC TTCGCTGTGG CTACCCATGC TGACCGCCAT 
651 ACTGATGGCG CACGGCGTGA TGCCTTGGCT GTCGGCGGCT TTCGCGTTTG 
701 CGGCGGGCGT GATTTTTACC GTACAGGTGT ACCGCTGGTG GTATAAACCC 
751 GTATTGAAAG AACCGATGCT GTGGATTCTG TTTGCCGGCT ATCTGTTTAC 
801 CGGATTGGGG CTGATTGCGG TCGGCGCGTC TTATTTCAAA CCTGCCTTCC 
851 TCAATCTGGG CGTACATCTG ATCGGGGTCG GCGGTATCGG CGTGCTGACT 
901 TTGGGCATGA TGGCGCGTAC CGCGCTCGGT CATACGGGCA ATTCGATTTA 
951 TCCGCCGCCC AAAGCCGTTC CCGTTGCGTT TTGGCTGATG ATGGCGGCAA 
1001 CCGCCGTCCG TATGGTTGCC GTATTTTCTT CCGGCACTGG CTACACGCAC 
1051 AGCATCCGCA CGTCTTCGGT TTTGTTTGCA CTCGCGCTGC TGGTGTATGC 
1101 GTGGAAATAC ATTCCGTGGC TGATCCGTCC GCGTTCGGAC GGCAGGCCCG 
1151 GTTGA 



This encodes a protein having amino acid sequence [<SEQ ID 714; ORF47ng-1>] (SEQ ID NO: 
714; ORF47ng-1): 



1 MKFTKHPVWA MAFRPFYSLA ALYGALSVLL WGFGYTGTHE LSGFYWHAHE 

51 MIWGYAGLW IAFLLTAVAT WTGQPPTRGG VLVGLTAFWL AARIAAFIPG 

101 WGAAASGILG TLFFWYGAVC MALPVIRSQN RRNYVAVFAI FVLGGTHAAF 

151 HVQLHNGNLG GLLSGLQS GL VMVSGFIGLI GMRII SFFTS KRLNVPQIPS 

201 PKWVAQASLW LPMLTAILMA HGVMPWLSAA FAFAAGVIFT VQVYRWWYKP 

251 VLKEPMLWIL FAGYLFTGLG LIAVGASYFK PAFLNLGVHL IGVGGIGVLT 

301 LGMMARTALG HTGNSIYPPP KAVP VAFWLM MAATAVRMVA V FSSGTAYTH 

351 SIRTSSVLFA LALLVYA WKY IPWLIRPRSD GRPG* 

ORF47ng-1 (SEQ ID NO: 714) and ORF47-1 (SEQ ID NO: 708) show 97.4% identity in 384 aa 
overlap: 



10 20 30 40 50 60 

or f 4 7 - 1 . pep MKFTKHPVWAMAFRPFYSLAALYGALSVLLWGFGYTGTHELSGFYWHAHEMIWGYAGLVV 

I I I I I I I I I I I I I I I I I I I M I I I I I I I I II I I I I I I I I I ! I ! I II I I I I i I I I I I I 
orf47ng-l MKFTKHPVWAMAFRPFYSLAALYGALSVLLWGFGYTGTHELSGFYWHAHEMIWGYAGLW 

10 20 30 40 50 60 



70 80 90 100 110 120 

orf47-l .pep IAFLLTAVATWTGQPPTRGGVLVGLTIFWLAARIAAFIPGWGASASGILGTLFFWYGAVC 

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I = I I I I I I I I I I I I I I I I 
orf 4 7ng- 1 I AFLLTAVATWTGQPPTRGGVLVGLTAFWLAARIAAF I PGWGAAASGILGTLFFWYGAVC 

70 80 90 100 110 120 



130 140 150 160 170 180 

or f 4 7 - 1 . pep MALPVIRSQNQRNYVAVFALFVLGGTHAAFHVQLHNGNLGGLLSGLQSGLVMVSGFIGLI 

lllllll llhlllMIIMllMMIIIMII MINIM IIIMIMMI lllllll 

orf 4 7ng- 1 ' MALPVIRSQNRRNYVAVFAIFVLGGTHAAFHVQLHNGNLGGLLSGLQSGLVMVSGFIGLI 

130 140 150 160 170 180 



190 200 210 220 230 240 

or f 4 7 - 1 . pep GTRI ISFFTSKRLNVPQI PSPKWVAQASLWLPMLTAMLMAHGVLAWLSAVFAFAAGVI FT 

I MIIIIIIIIIIIIMI IIIIIIIMMilllhlll In IIIMIMMI II 

orf4 7ng-l GMR 1 1 S FFTS KRLNVPQ I PS PKWVAQASLWLPMLTA I LMAHGVMPWLS AAFAFAAGV I FT 

190 200 210 . 220 230 240 



250 260 270 280 290 300 

orf 4 7 - 1 . pep VQVYRWWYKPVLKEPMLWILFAGYLFTGLGLIAVGASYFKPAFLNLGVHLIGVGGIGVLT 

1 1 II 1 1 1 1 1 1 1 1 M 1 1 1 M I M II 1 1 1 1 1 1 1 1 1 1 1 1 1 M 1 1 1 1 M I II M 1 1 M 1 1 M 

orf 4 7ng- 1 VQ VYRWWYKP VLKEPMLWIL FAG YLFTGLGL I AVGASYFKPAFLNLGVHL IGVGGIGVLT 

250 260 270 280 290 300 



310 320 330 340 350 360 

or f 4 7 - 1 . pep LGMKARTALGHTGNPIYPPPKAVPVAFWLMMAATAVRMVAVFSSGTAYTHSIRTSSVLFA 

IMIMIIMM I M M 1 1 1 1 1 II 1 1 Ml I i 1 1 1 1 1 1 1 1 1 1 i 1 1 1 1 1 1 1 1 1 M I 

orf4 7ng-l LGMMARTALGHTGNSIYPPPKAVPVAFWLMMAATAVRMVAVFSSGTAYTHSIRTSSVLFA 






310 320 330 340 350 360 

370 380 
orf 47-1 .pep LALLVYAWKY I PWL IRPRSDGRPGX 

Illllllllllllllllllllllll 
orf47ng-1 LALLVYAWKYIPWLIRPRSDGRPGX 

370 380 

Furthermore, ORF47ng-1 (SEQ ID NO: 714) shows significant homology to an ORF (SEQ ID NO: 
1157) from Pseudomonas stutzeri: 

gnl|PID|e246540 (Z73914) ORF396 protein [Pseudomonas stutzeri] Length = 396 

Score = 155 bits (389), Expect = 5e-37 

Identities = 121/391 (30%), Positives = 169/391 (42%), Gaps = 21/391 (5%) 





Query : 


/ 


D\7MaMaT?DDT?VCT a AT VPST Q\7T T WnPTiVTnTWPT .Qf^PY- - - WWAHFMTWflYAmW 
f V VVMJ v lH.r Krr I oJ-iAAL I unbo ViiiiWur ul lul nciLiour 1 HrLMXiJ&iM J. wvj ± nuu v 


59 








P+W +AFRPF+ +LY L++ LW +TG GF WH HEM++G+A + 




15 


Sbjct : 


14 


PIWRLAFRPFFLAGSLYALLAIPLWVAAWTGLWP- -GFQPTGGWLAWHRHEMLFGFAMAI 


71 




Query: 60 VIAFLLTAVATWTGQPPTRGGVLVGLTAFWLAARIAAFIPGWGAAASGILGTLFFWYGAV 119 
V FLLTAV TWTGQ G LVGL A WLAAR+ + + G AA L LF 
Sbjct: 72 VAGFLLTAVQTWTGQTAPSGNRLVGLAAVWLAARL - GWL FGL P AAWLAPLDLL FL VALVW 130 

Query: 120 CMALPVIRSQNRRNYVAVFAIFVLGGTHAAFXXXXXXXXXXXXXXXXXXXXXMVSGFIGL 179 
MA + + +RNY V + ++ G +V+ + L 
Sbjct: 131 MMAQMLWAVRQKRNYPIVWLSLMLGADVLILTGLLQGNDALQRQGVLAGLWLVAALMAL 190 

Query: 180 I GMR IISFFTS KRLNVPQ I PS P - KWVAQAS LWL PMLTAI LMAHGV MPWLS AAFAFA 234 
IG R+I FFT + L P W+ A L + A+L A GV P L FA 
Sbjct: 191 IGGRVIPFFTQRGLGKVDAVKPWVWLDVALLVGTGVIALLHAFGVAMRPQPLLGLLFV-A 249 

Query: 235 AGV I FTVQ VYRWWYKP VL KE PMLW I L F AG YL FTGLGL I AVG AS Y F - KP AFXXXXXXXXXX 293 
GV + + + RW+ K + K +LW L L+ + + +F A 
Sbjct: 250 I GVGHLLRLMRWYDKG I WKVGLLWSLHVAMLWLWAAFGLALWHFGLLAQS S PS LHALS V 309 

Query: 294 XXXXXXXXXMMARTALGHTGNSIYPPPKAVPVAFWLXXXXXXXXXXXXFSSGTAYTHSIR 353 
M+AR LGHTG + P+AFL FS + 
Sbjct: 310 GSMSGL I LAM I ARVTLGHTGRPLQLPAG 1 1 G - AFVL FNLGTAARVFLSVAWPVGGLW 365 

Query: 354 TSSVLFALALLVYAWKY I PWLIRPRSDGRPG 384 
++V + LA +Y W+Y P L+ R DG PG 
Sbjct: 366 LAAVCWTLAFALYVWRYAPMLVAARVDGHPG 396 





Based on this analysis, it is predicted that the proteins from N. meningitidis and N. gonorrhoeae, and 
their epitopes, could be useful antigens for vaccines or diagnostics, or for raising antibodies. 

Example 85 



The following partial DNA sequence was identified in N. meningitidis [<SEQ ID 715>] (SEQ ID 
NO: 715): 

1 . ATGCCGTCTG AAGGTTCAGA CGGCmTCGGT GyCGGGGAAy CAGAAGyGGT 
51 AGCGCATGCC CAATGAGACT TCGTGGGTTT TGAAGCGGGT GTTTTCCAAG 
101 CGTCCCCAGT TGTGGTAACG GTATCCGGTG TCyAArGTCA GCTTGGGyGT 
151 GATGTCGAAa CCGACACCGG CGATGACACC AAGACCyAmG CTGCTGATrC 
201 TGTkGCTTTC GTGATAGGsA GGTTTGyTGG kmksAsyTTG TAyrATwkkG 
251 CCTssCwsTG kAGmGCCkTk CkyTGGTkkA swGrwArTAG TCGTGGTTTy 
301 TkTTyyCACC GAATGAACyT GATGTTTAAC GTGTCCGTAG GCGACGCGCG 
351 CGCCGATATA GGGTTTGAAT TTATCGTTGA GTTTGAAATC GTAAATGGCG 
401 GACAAGCCGA GAGAAGAAAC GGCGTGGAAG CTGCCGTTTC CCTGATGTTT 
451 TGTTTGGGTT TCTTTGTAGT TGTTGTTTAT CTCTTCAGTA ACTTTTTTAG 
501 TAGAAGAATT ACTTTCTTTC CATTTTCTGT AACTGGCATA ATCTGCCGCT 
551 ATTCTCCAGC CGCCGAAATC 



This corresponds to the amino acid sequence [<SEQ ID 716; ORF67>] (SEQ ID NO: 716; 
ORF67): 



1 . . MPSEGSDGXG XGEXEXVAHA QXDFVGFEAG VFQASPVWT VSGVXXQLGX 
51 DVETDTGDDT KTXAADXVAF VIGRFXGXXL YXXAXXXXAX XWXXXXSRGF 
101 XXHRMNLMFN VS VGDARAD I GFEFIVEFEI VNGGQAERRN GVEAAVSLMF 
151 CLGFFWWY LFSNFFSRRI TFFPFSVTGI ICRYSPAAEI . . 
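
The partial sequence above contains lower-case IUPAC ambiguity codes (for example `y` for C or T, `m` for A or C), and the corresponding positions of the amino acid sequence are shown as `X`. The sketch below illustrates one way such residues can arise: each ambiguous codon is expanded into every concrete codon it could represent, and a residue is reported only when all expansions agree. It assumes the standard genetic code. The names `possible_amino_acids` and `translate_ambiguous` are hypothetical, and this is not necessarily how the original translation was produced.

```python
from itertools import product

# IUPAC nucleotide codes and the bases each may stand for.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT", "K": "GT", "M": "AC",
    "B": "CGT", "D": "AGT", "H": "ACT", "V": "ACG", "N": "ACGT",
}

# Standard genetic code, built from the conventional TCAG ordering.
BASES = "TCAG"
AMINO_ACIDS = (
    "FFLLSSSSYY**CC*W"
    "LLLLPPPPHHQQRRRR"
    "IIIMTTTTNNKKSSRR"
    "VVVVAAAADDEEGGGG"
)
CODON_TABLE = {
    b1 + b2 + b3: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, b1 in enumerate(BASES)
    for j, b2 in enumerate(BASES)
    for k, b3 in enumerate(BASES)
}


def possible_amino_acids(codon: str) -> set[str]:
    """All residues a (possibly ambiguous) codon could encode."""
    # Unknown symbols are treated like 'N' (any base).
    options = [IUPAC.get(base, "ACGT") for base in codon.upper()]
    return {CODON_TABLE["".join(bases)] for bases in product(*options)}


def translate_ambiguous(dna: str) -> str:
    """Translate frame 1, writing 'X' wherever the residue is not unique."""
    seq = "".join(dna.split()).upper()
    usable = len(seq) - len(seq) % 3
    residues = []
    for pos in range(0, usable, 3):
        candidates = possible_amino_acids(seq[pos:pos + 3])
        residues.append(candidates.pop() if len(candidates) == 1 else "X")
    return "".join(residues)


# First 30 bases of the partial ORF67 sequence listed above.
print(translate_ambiguous("ATGCCGTCTG AAGGTTCAGA CGGCmTCGGT"))  # MPSEGSDGXG
```

Here the codon `mTC` could be ATC (Ile) or CTC (Leu), so the ninth residue is left as `X`, in agreement with the listed amino acid sequence.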

20 

Computer analysis of this amino acid sequence gave the following results: 



Homology with a predicted ORF from N. gonorrhoeae 



ORF67 (SEQ ID NO: 716) shows 51.8% identity over a 199 aa overlap with a predicted ORF 
(ORF67ng) (SEQ ID NO: 718) from N. gonorrhoeae: 



25 or f 6 7 . pep MPSEGSDGXGXGEXEXVAHAQXDFVGFEAG 3 0 

Illlllll I II I Mill lllllll 
orf 67ng TNFEIAVLSGMTVRVFYCARPAPVNGGRLKMPSEGSDGIGIGESEAVAHAQRGFVGFEAG 14 6 

90 100 110 120 130 140 

orf 6 7 . pep VFQAS PVWTVSGVXXQLGXDVETDTGDDTKTXAADXVAFV I GRFXGXXL YXXAXXXXAX 9 0 

30 ' Mllllllhhll I I II = = ::: II Mhll I 

orf 6 7ng VFQAS PVWAVAGVQGQAGRDVYAHARHRAEAQAAAAVAFL I GVFLRMSVR I NRNCCVS I 2 06 

orf 67 . pep XWXXXXSRGFXXHRMNLMFNVSVGDARADIGFEFIVEFEIVNGGQAERRNGVEAAVSLMF 150 

I : h: : M II I II h I II II h I II I I I I I II II II I I I I II Ml 
orf 67ng TRVGGKSTCYFFSRIDAVSDVSVGDARTDIGFEFWEFEIVNGGQAERRNGVECAVFLMF 266 

35 orf 67. pep CLGFFW WYLFSNFFSRRITFF-PFSVTGI ICRYSPAAEI 190 

III :: |: |: : | : M till! Mllh 

orf 67ng RLLVFYVKLVAAKSFI ILSFQLFYVHGIFIWPFPVTGI IRGDAPAAEWADRHPGVDGM 326 

The ORF67ng nucleotide sequence [<SEQ ID 717>] (SEQ ID NO: 717) is predicted to encode a 
protein comprising amino acid sequence [<SEQ ID 718>] (SEQ ID NO: 718): 



1 MPSETVGSIV NVGVDESVGF SPPFPSIQHF YRFHRIHRIR LFRPPGPMQL 
51 NRHSHGSGNL GRGVWATVLS DKFPCGQVRI PACAGMTNFE IAVLSGMTVR 
101 VFYCARPAPV NGGRLKMPSE GSDGIGIGES EAVAHAQRGF VGFEAGVFQA 
151 SPVWAVAGV QGQAGRDVYA HARHRAEAQ A AAAVAFLIGV FLRMSVR INR 
201 NCCVSITRVG GKSTCYFFSR IDAVSDVSVG DARTDIGFEF WEFEIVNGG 
251 QAERRNGVEC AVFLMFRLLV FYVKLVAAKS FIILSFQLFY VHGIFIWPF 

301 PVTGI IRGDA PAAEWADRH PGVDGMRTDV SEIIAYRAYF VFAWSGWFRI 
351 IVGNAFGGVG * 

Based on the presence of several putative transmembrane domains in the gonococcal protein, it is 
predicted that the proteins from N. meningitidis and N. gonorrhoeae, and their epitopes, could be 
useful antigens for vaccines or diagnostics, or for raising antibodies. 

Example 86 

The following partial DNA sequence was identified in N. meningitidis [<SEQ ID 719>] (SEQ ID 
NO: 719): 

1 ATGTTTGCTT TTTTAGAAGC CTTTTTTGTC GAATACGGTT ATGCGGCTGT 

51 TTTTTTTGTA TTGGTCATCT GCGGTTTCGG CGTGCCGATT CCCGAGGATT 

101 TGACCTTGGT AACAGGCGGC GTGATTTCGG GTATGGGTTA TACCAATCCG 

151 CATATTATGT TTGCAGTCGG TATGCTCGGC GTATTGGTCG GGGACGGCAT 

201 CATGTTCGCC GCCGGACGAA TTTGGGGGCA GArArTCCTA rGGTTCArAC 

251 CTATTGCGsG CATCATGACG CCGrAACGTT ATGAGCAGGT TCAGGAAAAA 

301 TTCGACAAAT ACGGTAACTG GGTCTTATTT GTCGCCCGTT TCCTGCCCGG 

351 TTTGAGAACG GCCGTATTTG TTACAGCCGG TATCAGCCGC AAGGTTTCAT 

401 ACTTGCGTTT TATCATTATG GATGGACTGG CCGCA. . . 

This corresponds to the amino acid sequence [<SEQ ID 720; ORF78>] (SEQ ID NO: 720; 
ORF78): 



1 MFAFLEAFFV EYG YAAVFFV LVICGFGVPI PEDLTLVTGG VISGMGYTNP 

51 H IMFAVGMLG VLVGDGIM FA AGRIWGQXXL XFXPIAXIMT PXRYEQVQEK 

101 F DKYGNWVLF VARFLPGL RT AVFVTAGISR KVSYLRFIIM DGLAA. . . 

Further work revealed the complete nucleotide sequence [<SEQ ID 721>] (SEQ ID NO: 721): 

1 ATGTTTGCTT TTTTAGAAGC CTTTTTTGTC GAATACGGTT ATGCGGCTGT 

51 TTTTTTTGTA TTGGTCATCT GCGGTTTCGG CGTGCCGATT CCCGAGGATT 

101 TGACCTTGGT AACAGGCGGC GTGATTTCGG GTATGGGTTA TACCAATCCG 

151 CATATTATGT TTGCAGTCGG TATGCTCGGC GTATTGGTCG GGGACGGCAT 

201 CATGTTCGCC GCCGGACGAA TTTGGGGGCA GAAAATCCTA AGGTTCAAAC 

251 CTATTGCGCG CATCATGACG CCGAAACGTT ATGAGCAGGT TCAGGAAAAA 

301 TTCGACAAAT ACGGTAACTG GGTCTTATTT GTCGCCCGTT TCCTGCCCGG 

351 TTTGAGAACG GCCGTATTTG TTACAGCCGG TATCAGCCGC AAGGTTTCAT 

401 ACTTGCGTTT TATCATTATG GATGGACTGG CCGCACTGAT TTCCGTCCCT 

451 ATTTGGATTT ATCTGGGCGA ATACGGTGCG CACAACATCG ATTGGCTGAT 

501 GGCGAAAATG CACAGCCTGC AATCGGGTAT TTTTGTTATC TTGGGTATAG 

551 GTGCGACCGT TGTCGCTTGG ATTTGGTGGA AAAAACGCCA ACGTATCCAG 

601 TTTTACCGCA GCAAATTGAA AGAAAAGCGG GCGCAACGCA AAGCCGCCAA 

651 GGCAGCCAAA AAAGCCGCGC AAAGCAAACA ATAA 

This corresponds to the amino acid sequence [<SEQ ID 722; ORF78-1>] (SEQ ID NO: 722; 
ORF78-1): 

1 MFAFLEAFFV EYG YAAVFFV LVICGFGVPI PEDLTLVTGG VISGMGYTNP 

51 H IMFAVGMLG VLVGDGIM FA AGRIWGQKIL RFKPIARIMT PKRYEQVQEK 

101 FDKYGNWVLF VARFLPGLRT AVFVTAGISR KVSYLRFIIM DGLAALISVP 

151 IWIYLGEYGA HNIDWLMAKM HSLQ SGIFVI LGIGATWAW IW WKKRQRIQ 

201 FYRSKLKEKR AQRKAAKAAK KAAQSKQ* 

Computer analysis of this amino acid sequence predicts several transmembrane domains, and also 
gave the following results: 

Homology with the dedA homologue of H. influenzae (accession number P45280) (SEQ ID NO: 
1158) 

ORF78 (SEQ ID NO: 720) and the dedA homologue (SEQ ID NO: 1158) show 58% aa identity in a 
144 aa overlap: 

Orf78: 4 FLEAFFVEYGYAAVFFVLVICGFGVPIPEDLTLVTGGVISGM--GYTNPHIMFAVGMLGV 61 
FL FF EYGY AV FVL+ 1 CGFGVP I PED+TLV+GGVI +G+ N H+M V M+GV 
DedA: 20 FLIGFFTEYGYWAVLFVLIICGFGVPIPEDITLVSGGVIAGLYPENVNSHLMLLVSMIGV 79 

Orf78: 62 LVGDGIMFAAGRIWGQXXLXFXPIAXIMTPXRYEQVQEKFDKYGNWVLFVARFLPGLRTA 121 
L GD M+ GRI+G L F PI I+T R V+EKF +YGN VLFVARFLPGLR 
DedA: 80 LAGDSCMYWLGRIYGTKILRFRPIRRIVTLQRLRMVREKFSQYGNRVLFVARFLPGLRAP 139 

Orf78: 122 
+++ +GI+R+VSY+RF+++D AA 
DedA: 140 



Homology with a predicted ORF from N. meningitidis (strain A) 

ORF78 (SEQ ID NO: 720) shows 93.8% identity over a 145 aa overlap with an ORF (ORF78a) 
(SEQ ID NO: 724) from strain A of N. meningitidis: 

10 20 30 40 50 60 

orf 78 . pep MFAFLEAFFVEYG YAAVFFVLVI CGFGVP I PEDLTLVTGGVISGMGYTNPH IMFAVGMLG 

I 1 : 1 1 1 1 I M 1 1 1 1 1 1 1 1 1 1 i I ! I ! 1 1 1 1 II 1 1 1 1 1 1 1 1 1 1 1 1 1 1 II I 1 1 i M I 1 1 

orf 78a MFALLEAFFVEYG YAAVFFVLVI CGFGVP I PEDLTLVTGGVISGMGYTNPH IMFAVGMLG 

30 10 20 30 40 50 60 

70 80 90 100 110 120 

orf 78 . pep VL VGDG I M F AAGR I WGQXXLX FX P I AX I MT PXR YEQ VQE KFDKYGN WVL FVARFL PGLRT 

MIIIIIMI llllll | | Ml I I I I II I I I I I I I I I I I U I I I I I 1 1 1 1 
orf 78a VLVGDGIM FAAGRIWGQKILKFKPIARIMTPKRYAQVQEKFDKYGN WVLFVARFLPGLRT 
70 80 90 100 110 120 

130 140 

orf78.pep AVFVTAGISRKVSYLRFIIMDGLAA 

1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 H 1 1 1 1 1 

orf78a AVFVTAGISRKVSYLRFLIMDGLAALISVPVWIYLGEYGAHNIDWLMAKMHSLQSGIFIA 

130 140 150 160 170 180 



The complete length ORF78a nucleotide sequence [<SEQ ID 723>] (SEQ ID NO: 723) is: 

1 ATGTTTGCCC TTTTGGAAGC CTTTTTTGTC GAATACGGCT ATGCGGCCGT 
51 GTTTTTCGTT TTGGTCATCT GCGGTTTCGG CGTGCCGATT CCCGAGGATT 
101 TGACCTTGGT AACAGGCGGC GTGATTTCGG GTATGGGTTA TACCAATCCG 
151 CATATTATGT TTGCAGTCGG TATGCTCGGC GTATTGGTCG GGGACGGCAT 
201 CATGTTCGCC GCCGGACGCA TCTGGGGGCA GAAAATCCTC AAGTTCAAAC 
251 CGATTGCGCG CATCATGACG CCGAAACGTT ACGCACAGGT TCAGGAAAAA 
301 TTCGACAAAT ACGGCAACTG GGTGTTATTT GTCGCTCGTT TCCTGCCCGG 
351 TTTGCGGACT GCCGTTTTCG TTACCGCCGG CATCAGCCGC AAAGTATCGT 
401 ATCTGCGCTT TCTGATTATG GACGGGCTTG CCGCGCTGAT TTCCGTGCCC 
451 GTTTGGATTT ACTTGGGCGA GTACGGCGCG CACAACATCG ATTGGCTGAT 
501 GGCGAAAATG CACAGCCTGC AATCCGGCAT CTTCATCGCA TTGGGCGTGC 
551 TGGCGGCGGC GCTGGCGTGG TTCTGGTGGC GCAAACGCCG ACATTATCAG 
601 CTTTACCGCG CACAATTGAG CGAAAAACGC GCCAAACGCA AGGCGGAAAA 
651 GGCAGCGAAA AAAGCGGCAC AGAAGCAGCA GTAA 



This encodes a protein having amino acid sequence [<SEQ ID 724>] (SEQ ID NO: 724): 

1 MFALLEAFFV EYGYAAVFFV LVICGFGVPI PEDLTLVTGG VISGMGYTNP 

51 HIMFAVGMLG VLVGDGIMFA AGRIWGQKIL KFKPIARIMT PKRYAQVQEK 

101 FDKYGNWVLF VARFLPGLRT AVFVTAGISR KVSYLRFLIM DGLAALISVP 

151 VWIYLGEYGA HNIDWLMAKM HSLQSGIFIA LGVLAAALAW FWWRKRRHYQ 

201 LYRAQLSEKR AKRKAEKAAK KAAQKQQ* 
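
The relationship between a nucleotide listing and the protein it encodes can be checked mechanically. The following is a minimal Python sketch, assuming the standard genetic code (the text does not name the translation table); the function name `translate` is illustrative and not part of the specification. It translates the first 30 bases and the last 9 bases of the ORF78a listing.

```python
# Standard genetic code (NCBI translation table 1), built from the
# conventional TCAG ordering of codons. Assumption: the listings use it.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def translate(dna: str) -> str:
    """Translate a DNA string in frame 1; unrecognised codons become 'X'."""
    dna = "".join(dna.split()).upper()      # drop the spaces between 10-mers
    usable = len(dna) - len(dna) % 3        # ignore a trailing partial codon
    return "".join(
        CODON_TABLE.get(dna[i:i + 3], "X") for i in range(0, usable, 3)
    )


# First three 10-base groups and the final three codons of the listing.
start = translate("ATGTTTGCCC TTTTGGAAGC CTTTTTTGTC")
end = translate("CAGCAGTAA")
print(start, end)
```

The 684-base listing corresponds to 228 codons, that is, 227 residues plus the stop codon shown as `*`.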

ORF78a (SEQ ID NO: 724) and ORF78-1 (SEQ ID NO: 722) show 89.0% identity in 227 aa 
overlap: 



10 20 30 40 50 60 

orf 78a . pep MFALLEAFFVEYGYAAVFFVLVICGFGVPIPEDLTLVTGGVISGMGYTNPHIMFAVGMLG 

I I :| I I I I I I I I II I M I I I I I I I I I I I I I I I I M I I I I I I I I I I I ' I I I I I I I I I I 
orf 78 - 1 MFAFLEAFFVEYGYAAVFFVLVICGFGVPIPEDLTLVTGGVISGMGYTNPHIMFAVGMLG 

10 20 30 40 50 60 



40 



70 80 90 100 110 ' 120 

orf 78a . pep VLVGDGIMFAAGRIWGQKILKFKPIARIMTPKRYAQVQEKFDKYGNWVLFVARFLPGLRT 

llllllllllll I III Nihil Mill 1 1 MM II II MM I III III II II II II II 

orf 78-1 VLVGDGIMFAAGRIWGQKILRFKPIARIMTPKRYEQVQEKFDKYGNWVLFVARFLPGLRT 

70 80 90 100 110 120 



45 



130 140 150 160 170 180 

orf 78a .pep AVFVTAGI SRKVS YLRFLIMDGLAALISVPVWIYLGEYGAHNIDWLMAKMHSLQSGIFIA 

I I I I I I I I I I I I I I M I I I I i I I I I I M I I U I I I I I I I I I I I I I I I II I M lh 
orf 78-1 AVFVTAGISRKVSYLRFIIMDGLAALISVPIWIYLGEYGAHNIDWLMAKMHSLQSGIFVI 

130 140 ' 150 160 170 180 



50 



190 200 210 220 

orf 78a . pep LGVLAAALAW FWWRKRRHYQLYRAQLSEKRAKRKAEKAAKKAAQKQQX 

Ih |:::||:||:||:: | : | | : : |: | | | h | || MMMIhMI 
orf 78- 1 LG IGAT WAW I WWKKRQR I QFYRS KLKEKRAQRKAAKAAKKAAQS KQX 

190 200 210 220 

Homology with a predicted ORF from N. gonorrhoeae 

ORF78 (SEQ ID NO: 720) shows 97.4% identity over 38 aa overlap with a predicted ORF 
(ORF78ng) (SEQ ID NO: 726) from N. gonorrhoeae: 

orf 78 .pep XXLXFXPIAXIMTPXRYEQVQEKFDKYGNWVLFVARFLPGLRTAVFVTAGISRKVSYLRF 137 

. 1 1 1 i 1 1 M i 1 1 1 1 1 1 M I ■ I 1 1 1 1 1 1 i I 

orf78ng YPVLFVARFLPGLRTAVFVTAGI SRKVS YLRF 32 

orf 78. pep IIMDGLAA' 145 
:|IMII1 

orf 78ng LIMDGLAALISVPVWIYLGEYGAHNIDWLMAKMHSLQSGIFIALGVLAAALAWFWWRKRR 92 

The ORF78ng nucleotide sequence [<SEQ ID 725>] (SEQ ID NO: 725) is predicted to encode a 
protein comprising amino acid sequence [<SEQ ID 726>] (SEQ ID NO: 726): 

1 .. YPVLFVARFL PGLRTAVFVT AGISRKVSYL RFLIMDGLAA LISVPVWIYL 
51 GEYGAHNIDW LMAKMHSLQS GIFIALGVLA AALAWFWWRK RRHYQLYRAQ 
101 LSEKRAKRKA EKAAKKAAQK QQ* 

Further work revealed the complete gonococcal nucleotide sequence [<SEQ ID 727>] (SEQ ID 
NO: 727): 

1 atgtttgccc tttTggaagc CTTTTTTGTC GAAtacggCt atgcGGCCGT 

51 GTTTTTCGTT TTGGTCATCT GCGGTTTCGG CGTGCCGATT CCCGAAGATT 

101 TGACCTTGGT AACGGGCGGC GTGATTTCGG GTATGGGTTA TACCAATCCG 

151 CATATTATGT TTGCGGTCGG TATGCTCGGC GTGTTGGCGG GCGACGGCGT 

201 GATGTTTGCC GCCGGACGCA TCTGGGGGCA GAAAATCCTC AAGTTCAAAC 

251 CGATTGCGCG CATCATGACG CCGAAACGTT ACGCGCAGGT TCAGGAAAAA 

301 TTCGACAAAT ACGGCAACTG GGTTCTGTTT GTCGCCCGTT TCCTGCCGGG 

351 TTTGCGGACT GCCGTTTTCG TTACCGCCGG CATCAGCCGC AAAGTATCGT 

401 ATCTGCGCTT TCTGATTATG GACGGGCTGG CCGCGCTGAT TTCCGTGCCC 

451 GTTTGGATTT ACTTGGGCGA GTACGGCGCG CACAACATCG ATTGGCTGAT 

501 GGCGAAAATG CACAGCCTGC AATCGGGCAT CTTCATCGCA TTGGGCGTGC 

551 TGGCGGCGGC GCTGGCGTGG TTCTGGTGGC GCAAACGCCG ACATTATCAG 

601 CTTTACCGCG CACAATTGAG CGAAAAACGC GCCAAACGCA AGGCGGAAAA 

651 GGCAGCGAAA AAAGCGGCAC AGAAGCAGCA GTAa 

This corresponds to the amino acid sequence [<SEQ ID 728; ORF78ng-1>] (SEQ ID NO: 728; 
ORF78ng-1): 

1 MFALLEAFFV EYGYAAVFFV LVICGFGVPI PEDLTLVTGG VISGMGYTNP 

51 HIMFAVGMLG VLAGDGVMFA AGRIWGQKIL KFKPIARIMT PKRYAQVQEK 

101 FDKYGNWVLF VARFLPGLRT AVFVTAGISR KVSYLRFLIM DGLAALISVP 

151 VWIYLGEYGA HNIDWLMAKM HSLQSGIFIA LGVLAAALAW FWWRKRRHYQ 

201 LYRAQLSEKR AKRKAEKAAK KAAQKQQ* 

ORF78ng-1 (SEQ ID NO: 728) and ORF78-1 (SEQ ID NO: 722) show 88.1% identity in 227 aa 
overlap: 

10 20 30 40 50 60 

orf 78-1 .pep MFAFLEAFFVEYGYAAVFFVLVI CGFGVP I PEDLTLVTGGVI SGMGYTNPHIMFAVGMLG 

1 1 :| I M 1 1 1 1 1 1 1 1 1 1 ! 1 1 1 1 1 1 1 1 1 M 1 1 i Mi M i 1 1 1 1 M 1 1 1 1 II 1 1 1 II 1 1 1 

orf 7'8ng-l MFALLEAFFVEYGYAAVFFVLVI CGFGVP I PEDLTLVTGGVI SGMGYTNPHIMFAVGMLG 

10 20 30 40 50 60 

70 80 90 100 110 120 

orf 78 - 1 . pep VLVGDGIMFAAGRIWGQKILRFKPIARIMTPKRYEQVQEKFDKYGNWVLFVARFLPGLRT 

I: I I M I I II I I I I I I I I: I I I I I I I I I I I M I I I I I I I I I I I I I I I I I I I I I I I 
orf 78ng-l VLAGDGVMFAAGRIWGQKILKFKPIARIMTPKRYAQVQEKFDKYGNWVLFVARFLPGLRT 

70 80 90 100 110 120 

130 140 150 160 170 180 

orf 78-1 .pep AVFVTAGISRKVSYLRFIIMDGLAALISVPIWIYLGEYGAHNIDWLMAKMHSLQSGIFVI 

II II I M I I I II I M MM M I I M I M M I I I M I I I I I I M M I II I II I II II M 
orf 78ng-l AVFVTAGISRKVSYLRFLIMDGLAALISVPVWIYLGEYGAHNIDWLMAKMHSLQSGIFIA 

130 140 150 160 170 180 

190 200 210 220 

orf 78-1 .pep LG I GATWAW I WWKKRQR I QFYRS KLKEKRAQRKAAKAAKKAAQS KQX 

||: |::MIMMIM: ;MMMMIIMM Mill MM 
orf 78ng- 1 LGVLAAALAWFWWRKRRHYQLYRAQLSEKRAKRKAEKAAKKAAQKQQX 

190 200 210 220 
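
Figures such as "88.1% identity in 227 aa overlap" summarise how many aligned columns carry the same residue. A minimal Python sketch of that calculation follows; it is not the program used to produce the alignments (which is not named here), and the treatment of gaps and of `X` as non-matching is an assumption. The function name `percent_identity` is illustrative.

```python
def percent_identity(seq_a: str, seq_b: str) -> float:
    """Percent of aligned columns whose residues are identical.

    Both strings must already be aligned (equal length). '-' marks a gap and
    'X' an undetermined residue; neither is counted as a match here, which
    is an assumption about how the alignment program scored them.
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("aligned sequences must have equal length")
    matches = sum(
        1 for a, b in zip(seq_a, seq_b)
        if a == b and a not in "-X"
    )
    return 100.0 * matches / len(seq_a)


# Residues 1-60 of the ORF78-1 / ORF78ng-1 alignment (first block above).
orf78_1 = "MFAFLEAFFVEYGYAAVFFVLVICGFGVPIPEDLTLVTGGVISGMGYTNPHIMFAVGMLG"
orf78ng_1 = "MFALLEAFFVEYGYAAVFFVLVICGFGVPIPEDLTLVTGGVISGMGYTNPHIMFAVGMLG"
block_identity = percent_identity(orf78_1, orf78ng_1)
print(f"{block_identity:.1f}% identity over {len(orf78_1)} aa")
```

Applying the same function to the full 227-column alignment, rather than to one block, would give the overall figure quoted in the text.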

Furthermore, ORF78ng-1 (SEQ ID NO: 728) shows homology to the dedA protein (SEQ ID NO: 
1158) from H. influenzae: 

sp|P45280|YG29_HAEIN HYPOTHETICAL PROTEIN HI1629 >gi|1073983|pir||D64133 dedA 
protein (dedA) homolog - Haemophilus influenzae (strain Rd KW20) 
>gi|1574476 (U32836) dedA protein (dedA) [Haemophilus influenzae] Length = 212 
Score = 223 bits (563), Expect = 7e-58 

Identities = 108/182 (59%), Positives = 140/182 (76%), Gaps = 2/182 (1%) 

Query: 5   LEAFFVEYGYAAVFFVLVICGFGVPIPEDLTLVTGGVISGM--GYTNPHIMFAVGMLGVL 62
           L  FF EYGY AV FVL+ICGFGVPIPED+TLV+GGVI+G+     N H+M  V M+GVL
Sbjct: 21  LIGFFTEYGYWAVLFVLIICGFGVPIPEDITLVSGGVIAGLYPENVNSHLMLLVSMIGVL 80

Query: 63  AGDGVMFAAGRIWGQKILKFKPIARIMTPKRYAQVQEKFDKYGNWVLFVARFLPGLRTAV 122
           AGD  M+  GRI+G KIL+F+PI RI+T +R   V+EKF +YGN VLFVARFLPGLR  +
Sbjct: 81  AGDSCMYWLGRIYGTKILRFRPIRRIVTLQRLRMVREKFSQYGNRVLFVARFLPGLRAPI 140

Query: 123 FVTAGISRKVSYLRFLIMDGLAALISVPVWIYLGEYGAHNIDWLMAKMHSLQSGIFIALG 182
           ++ +GI+R+VSY+RF+++D AA+ISVP+WIYLGE GA N+DWL ++ Q I+I +G
Sbjct: 141
Query: 183
Sbjct: 201

Based on this analysis, including the presence of putative transmembrane domains, it is predicted 
that these proteins from N. meningitidis and N. gonorrhoeae, and their epitopes, could be useful 
antigens for vaccines or diagnostics, or for raising antibodies. 



Example 87 



The following partial DNA sequence was identified in N. meningitidis [<SEQ ID 729>] (SEQ ID 
NO: 729): 



1 ATGAAAAAAT TATTGGCGGC CGTGATGATG GCAGGTTTGG CAGGCGCGGT 

51 TTCCGCCGCC GGAGTCCACG TTGAGGACGG CTGGGCGCGC ACCACCGTCG 

101 AAGGTATGAA AATAGGCGGC GCGTTCATGA AAATCCACAA CGACGAAGCC 

151 AAACAAGACT TTTTGCTCGG CGGAAGCAGC CCCGTTGCCG ACCGCGTCGA 

201 AGTGCATACC CACATCAACG ACAACGGCGT GATGCGGATG CGCGAAGTCG 

251 AAGGCGGCGT GCCTTTGGAA GCGAAATCCG TTACCGAACT CAAACCCGGC 

301 AGCTATCATG TGATGTTTAT GGGTTTGAAA AAACAATTAA AAGAGGGCGA 

351 TAAAATTCCC GTTACCCTGA AATTTAAAAA CGCCAAAGCG CAAACCGTCC 

401 AACTGGAAGT CAAAATCGCG CCGATGCCGG CAATGAACCA C... 

This corresponds to the amino acid sequence [<SEQ ID 730; ORF79>] (SEQ ID NO: 730; 
ORF79): 

1 MKKLLAAVMM AGLAGAVSAA GVHVEDGWAR TTVEGMKIGG AFMKIHNDEA 
51 KQDFLLGGSS PVADRVEVHT HINDNGVMRM REVEGGVPLE AKSVTELKPG 
101 SYHVMFMGLK KQLKEGDKIP VTLKFKNAKA QTVQLEVKIA PMPAMNH.. 

Further work revealed the complete nucleotide sequence [<SEQ ID 731>] (SEQ ID NO: 731): 



1 ATGAAAAAAT TATTGGCGGC CGTGATGATG GCAGGTTTGG CAGGCGCGGT 

51 TTCCGCCGCC GGAGTCCACG TTGAGGACGG CTGGGCGCGC ACCACCGTCG 

101 AAGGTATGAA AATAGGCGGC GCGTTCATGA AAATCCACAA CGACGAAGCC 

151 AAACAAGACT TTTTGCTCGG CGGAAGCAGC CCCGTTGCCG ACCGCGTCGA 

201 AGTGCATACC CACATCAACG ACAACGGCGT GATGCGGATG CGCGAAGTCG 

251 AAGGCGGCGT GCCTTTGGAA GCGAAATCCG TTACCGAACT CAAACCCGGC 

301 AGCTATCATG TGATGTTTAT GGGTTTGAAA AAACAATTAA AAGAGGGCGA 
351 TAAAATTCCC GTTACCCTGA AATTTAAAAA CGCCAAAGCG CAAACCGTCC 

401 AACTGGAAGT CAAAATCGCG CCGATGCCGG CAATGAACCA CGGTCATCAC 
451 CACGGCGAAG CGCATCAGCA CTAA 



This corresponds to the amino acid sequence [<SEQ ID 732; ORF79-1>] (SEQ ID NO: 732; 
ORF79-1): 

1 MKKLLAAVMM AGLAGAVSAA GVHVEDGWAR TTVEGMKIGG AFMKIHNDEA 

51 KQDFLLGGSS PVADRVEVHT HINDNGVMRM REVEGGVPLE AKSVTELKPG 

101 SYHVMFMGLK KQLKEGDKIP VTLKFKNAKA QTVQLEVKIA PMPAMNHGHH 

151 HGEAHQH* 

Computer analysis of this amino acid sequence revealed a putative leader peptide and also gave the 
following results: 

Homology with a predicted ORF from N. meningitidis (strain A) 

ORF79 (SEQ ID NO: 730) shows 94.6% identity over a 147 aa overlap with an ORF (ORF79a) 
(SEQ ID NO: 734) from strain A of N. meningitidis: 

10 20 30 40 50 60 

orf 79 . pep MKKLLAAVMMAGLAGAVSAAGVHVEDGWARTTVEGMKIGGAFMKIHNDEAKQDFLLGGSS 

II I I I I I I I I I II I I I ll I: I I I , I I I I I I M I h I I I I I I I I I I M I I I I I I I I I 
orf 79a MKXLLAAVMMAGLAGAVSAAGIHVEDGWARTTVEGMKMGGAFMKIHNDEAKQDFLLGGSS 

10 20 30 40 50 60 

70 80 90 100 110 120 

orf 79 . pep PVADRVEVHTH INDNGVMRMREVEGGVPLEAKSVTELKPGSYHVMFMGLKKQLKEGDKI P 

MIMIIIIM IIIIIMIMII IMIIIIIIIIII IMIIMI 1 1 1 1 1 Mill 

orf 79a PVADRVEVHTH I NDNGVMRMREVEGGVPLEAKSVTELKPGSYHVMFMGXKKQLKXGDK I P 

70 80 90 100 110 120 

130 140 
or f 7 9 . pep VTLKFKNAKAQTVQLEVKIAPMPAMNH 

I I I I I I I I I I I I I I I I Ml Ihl 
or f 7 9 a VTLKFKNAKAQTVQLE VKTAPMS AMDHGHHHGEAHQHX 

130 140 150 

The complete length ORF79a nucleotide sequence [<SEQ ID 733>] (SEQ ID NO: 733) is: 

1 ATGAAANAAC TATTGGCAGC CGTGATGATG GCAGGTTTGG CAGGCGCGGT 

51 TTCCGCCGCC GGAATCCACG TTGAGGACGG CTGGGCGCGC ACCACCGTCG 

101 AAGGTATGAA AATGGGCGGC GCGTTCATGA AAATCCACAA CGACGAAGCC 

151 AAACAAGACT TTTTGCTCGG CGGAAGCAGC CCTGTTGCCG ACCGCGTCGA 

201 AGTGCATACC CATATCAATG ATAACGGTGT GATGCGGATG CGCGAAGTCG 

251 AAGGCGGCGT GCCTTTGGAG GCGAAATCCG TTACCGAACT CAAACCCGGC 

301 AGCTATCATG TCATGTTTAT GGGTNTGAAA AAACAATTAA AAGANGGCGA 

351 CAAGATTCCC GTTACCCTGA AATTTAAAAA CGCCAAAGCA CAAACCGTCC 

401 AACTGGAAGT CAAAACCGCG CCGATGTCGG CAATGGACCA CGGTCATCAC 
451 CACGGCGAAG CGCATCAGCA CTAA 

This encodes a protein having amino acid sequence [<SEQ ID 734>] (SEQ ID NO: 734): 

1 MKXLLAAVMM AGLAGAVSAA GIHVEDGWAR TTVEGMKMGG AFMKIHNDEA 
51 KQDFLLGGSS PVADRVEVHT HINDNGVMRM REVEGGVPLE AKSVTELKPG 
101 SYHVMFMGXK KQLKXGDKIP VTLKFKNAKA QTVQLEVKTA PMSAMDHGHH 
151 HGEAHQH* 

ORF79a (SEQ ID NO: 734) and ORF79-1 (SEQ ID NO: 732) show 94.9% identity in 157 aa 
overlap: 



10 20 30 40 50 60 

orf 79a . pep MKXLLAAVMMAGLAGAVSAAGIHVEDGWARTTVEGMKMGGAFMKIHNDEAKQDFLLGGSS 

II I I I I I I I M I I I I I M I I : I I I I I I I I I I I I I I h I I I I I I I i I I I I I I I I M M I ' 
orf 79-1 MKKLLAAVMMAGLAGAVS AAGVHVEDGWARTTVEGMKIGGAFMKIHNDEAKQDFLLGGSS 

5 10 20 30 40 50 60 

70 80 90 100 110 120 

orf 7 9a . pep PVADRVEVHTHINDNGVMRMREVEGGVPLEAKSVTELKPGS YHVMFMGXKKQLKXGDKI P 

1 1 1 1 1 1 : 1 1 1 1 1 1 1 1 1 1 1 1 M 1 1 1 M 1 1 M 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 Mill Mill 

orf 79-1 . PVADRVEVHTHINDNGVMRMREVEGGVPLEAKSVTELKPGS YHVMFMGLKKQLKEGDKI P 
10 70 80 90 100 110 120 

130 140 150 

orf 79a. pep VTLKFKNAKAQTVQLEVKTAPMSAMDHGHHHGEAHQHX 

I I I I I I M I I I I I I I M Ml 1 I M I I I I I I I 11 I I 
orf 79-1 VTLKFKNAKAQTVQLEVKI APMPAMNHGHHHGEAHQHX 

15 130 140 150 

Homology with a predicted ORF from N. gonorrhoeae 

ORF79 (SEQ ID NO: 730) shows 96.1% identity over 76 aa overlap with a predicted ORF 
(ORF79ng) (SEQ ID NO: 736) from N. gonorrhoeae: 

orf 79 . pep FMKIHNDEAKQDFLLGGSSPVADRVEVHTHINDNGVMRMREVEGGVPLEAKSVTELKPGS 101 

20 | | | | || | || I I h I II I I II I I I I I I I I I I 

orf 7 9ng INDNGVMRMREVKGGVPLEAKSVTELKPGS 3 0 

orf 79 . pep YHVMFMGLKKQLKEGDKI PVTLKFKNAKAQTVQLEVK I APMPAMNH 14 7 

I I I I I I I I I I I I I I I I I I I I II I I Ml I I I I I I I Ml MM 
orf 7 9ng YHVMFMGLKKQLKEGDKI PVTLKFKNAKAQTVQLEVKTAPMS AMNHGHHHGEAHQH 8 6 

An ORF79ng nucleotide sequence [<SEQ ID 735>] (SEQ ID NO: 735) was predicted to encode a 
protein comprising amino acid sequence [<SEQ ID 736>] (SEQ ID NO: 736): 

1 .. INDNGVMRMR EVKGGVPLEA KSVTELKPGS YHVMFMGLKK QLKEGDKIPV 
51 TLKFKNAKAQ TVQLEVKTAP MSAMNHGHHH GEAHQH* 

Further work revealed the complete gonococcal DNA sequence [<SEQ ID 737>] (SEQ ID NO: 
737): 

1 ATGAAAAAAT TATTGGCAGC CGTGATGATG GCAGGTTTGG CAGGCGCGGT 

51 TTccgccgCc GGagTccAtG TCGAggACGG CTGGGCGCGc accaCTGtcg 

101 aaggtATgaa aatggGCGGC GCgttCATga aaATCCACAA CGACGaaGcc 

151 atacaaGACt ttgtgcTCgg CGGaagcatg cccgttgccg accgcGTCGA 

201 AGTGCAtaca cacATCAACG ACAACGGCGT GATGCGTATG CGCGAAGTCA 

251 AAGGCGGCGT GCCTTTGGAG GCGAAATCCG TTACCGAACT CAAACCCGGC 

301 AGCTATCACG TGATGTTTAT GGGTTTGAAA AAACAACTGA AAGAGGGCGA 

351 CAAGATTCCC GTTACCCTGA AATTTAAAAA CGCCAAAGCG CAAACCGTCC 

401 AACTGGAAGT CAAAACCGCG CCGATGTCGG CAATGAACCA CGGTCATCAC 

451 CACGGCGAAG CGCATCAGCA CTAA 

This corresponds to the amino acid sequence [<SEQ ID 738; ORF79ng-1>] (SEQ ID NO: 738; 
ORF79ng-1): 

1 MKKLLAAVMM AGLAGAVSAA GVHVEDGWAR TTVEGMKMGG AFMKIHNDEA 
51 IQDFVLGGSM PVADRVEVHT HINDNGVMRM REVKGGVPLE AKSVTELKPG 
101 SYHVMFMGLK KQLKEGDKIP VTLKFKNAKA QTVQLEVKTA PMSAMNHGHH 

151 HGEAHQH* 

ORF79ng-1 (SEQ ID NO: 738) and ORF79-1 (SEQ ID NO: 732) show 95.5% identity in 157 aa 
overlap: 

10 20 30 40 50 60 

orf 79-1. pep MKKLLAAVMMAGLAGAVSAAGVHVEDGWARTTVEGMKIGGAFMKIHNDEAKQDFLLGGSS 

1 1 M 1 1 1 1 1 1 1 1 1 1 M 1 1 1 M 1 1 1 1 1 1 1 1 ! 1 1 M Ihl 1 1 M 1 1 1 ; 1 1 1 llhllll 

orf 79ng-1 MKKLLAAVMMAGLAGAVSAAGVHVEDGWARTTVEGMKMGGAFMKIHNDEAIQDFVLGGSM 

10 20 30 40 50 60 

15 70 80 90 100 110 120 

orf 7 9-1. pep PVADRVEVHTHINDNGVMRMREVEGGVPLEAKSVTELKPGS YHVMFMGLKKQLKEGDKI P 
I I I I I I I I I I I ' I I I I I I : I I I hi I M I I I I < I I I I I I I I I I I I I I I I I I I I I M I I I 
orf 79ng-l PVADRVEVHTHINDNGVMRMREVKGGVPLEAKSVTELKPGSYHVMFMGLKKQLKEGDKIP 

70 80 90 100 110 120 

20 130 140 150 

or f 7 9 - 1 . pep VTLKFKNAKAQTVQLEVKI APMPAMNHGHHHGEAHQHX 

IIIIIIIIIIMIIIMI III lllllllllllllll 
orf 7 9ng- 1 VTLKFKNAKAQTVQLEVKTAPMSAMNHGHHHGEAHQHX 

130 140 150 

25 

Furthermore, ORF79ng-1 (SEQ ID NO: 738) shows significant homology to a protein (SEQ ID 
NO: 1159) from Aquifex aeolicus: 

gi|2983695 (AE000731) putative protein [Aquifex aeolicus] Length = 151 
Score = 63.6 bits (152), Expect = 6e-10 
Identities = 38/114 (33%), Positives = 58/114 (50%), Gaps = 1/114 (0%) 

Query: 24 VEDGWARTTVEGMKMGGAFMKIHNDEAIQDFVLGGSMPVADRVEVHTHINDNGVMRMREV 83 . 

V+ W G M I N+ D+++G +A RVE+H + +N V +M 

Sbjct: 27 VKHP WVME P P PG PNTTMMGM 1 1 VNEGDE PD YL I GAKTD I AQRVELH KT V I ENDVAKMVPQ 86 

Query: 84 KGGVPLE AKSVTELKPGS YHVMFMGLKKQLKEGDKI PVTLKFKNAKAQTVQLEV 137 
35 + + + K E K YHVM +GLKK+ + KEGDK+ V L F+ + TV+ V 

Sbjct: 87 ER - 1 E I P PKGKVEFKHHGYHVM 1 1 GLKKR I KEGDKVKVEL I FEKSGKI TVEAP V 139 
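
The percentages in the BLAST summary lines follow directly from the counts beside them. The short Python sketch below recomputes them for the two BLAST reports quoted in this section; it assumes the percentages are truncated to whole numbers rather than rounded, which is consistent with the figures shown (for example 140/182 is reported as 76%). The helper name `blast_percent` is hypothetical.

```python
def blast_percent(count: int, length: int) -> int:
    """Whole-number percentage, truncated (assumed BLAST-style reporting)."""
    return (100 * count) // length


# (label, count, alignment length) as quoted in the two BLAST summaries.
reported = [
    ("dedA identities", 108, 182),
    ("dedA positives", 140, 182),
    ("dedA gaps", 2, 182),
    ("Aquifex identities", 38, 114),
    ("Aquifex positives", 58, 114),
    ("Aquifex gaps", 1, 114),
]
percents = {label: blast_percent(n, total) for label, n, total in reported}
for label, value in percents.items():
    print(f"{label}: {value}%")
```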

Based on this analysis, it is predicted that the proteins from N. meningitidis and N. gonorrhoeae, and 
their epitopes, could be useful antigens for vaccines or diagnostics, or for raising antibodies. 



ORF79-1 (SEQ ID NO: 732) (15.6 kDa) was cloned in the pET vector and expressed in E. coli, as 
described above. The products of protein expression and purification were analyzed by SDS-PAGE. 
Figure 18A shows the results of affinity purification of the His-fusion protein. Purified 
His-fusion protein was used to immunise mice, whose sera were used for ELISA (positive result) 
and FACS analysis (Figure 18B). These experiments confirm that ORF79-1 (SEQ ID NO: 732) is a 
surface-exposed protein, and that it is a useful immunogen. 

Example 88 

The following DNA sequence, believed to be complete, was identified in N. meningitidis [<SEQ ID 
739>] (SEQ ID NO: 739): 

1 ATGACGGTAA CTGCGGCCGA AGGCGGCAAA GCTGCCAAGG CGTTAAAAAA 

51 ATATCTGATT ACGGGCATTT TGGTCTGGCT GCCGATTGCG GTAACGGTTT 

101 GGGTGGTTTC CTATATCGTT TCCGCGTCCG ATCAGCTCGT CAACCTGCTG 

151 CCGAAGCAAT GGCGGCCGCA ATATGTTTTG GGGTTTAATA TCCCGGGGCT 

201 GGGCGTTATC GTTGCCATTG CCGTATTGTT TGTAACCGGA TTGTTTGCCG 

251 CCAACGTATT GGGTCGGCAG ATCCTCGCCG CGTGGGACAG CCTGTTGGGG 

301 CGGATTCCGG TTGTGAAAtC CATCTATTCG AGTGTGAAAA AAGTATCCGA 

351 ATacgTGCTG TCCGACAGCA GCCGTTCGTT TAAAACGCCG GTACTCGTGC 

401 CGTTTCCCCA GCCCGGTATT TGGACGATyG CTTTCGTGTC AGGGCAGGTG 
451 TCGAATGCGG TTAAGGCCGC ATTGCCGAAs GACGGCGATT ATCTTTCCGT 
501 GTATGTTCCG ACCACGCCGA ATCCGACCGG CGGTTACTAT ATTATGGTAA 
551 AGAAAAGCGA TGTGCGCGAA CTCGATATGA GCGTGGACGA AsCATTGAAA 
601 TATGTGATTT CGCTGGGTAT GGTCATCCCT GACGACCTGC CCGTCAAAAC 
651 ATTGGCAsGA CCTATGCCGT CTGAAAAGGC GGATTTGCCC GAACAACAAT 
701 AA 
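
The listings are printed as numbered rows of ten-base groups. To work with such a sequence as one string, the position labels and spacing have to be removed first. The following is a small Python sketch of that step, assuming that every letter in a row is sequence and everything else (digits, spaces, stray marks) is layout; `join_listing` is an illustrative name.

```python
import re


def join_listing(rows: list[str]) -> str:
    """Drop position labels, spaces and stray marks; keep only letters."""
    return "".join(re.sub(r"[^A-Za-z]", "", row) for row in rows)


# Last three rows of the SEQ ID 739 listing, typed as they appear above.
rows = [
    "601 TATGTGATTT CGCTGGGTAT GGTCATCCCT GACGACCTGC CCGTCAAAAC",
    "651 ATTGGCAsGA CCTATGCCGT CTGAAAAGGC GGATTTGCCC GAACAACAAT",
    "701 AA",
]
tail = join_listing(rows)
print(len(tail), tail[-3:])
```

Lower-case letters are kept as they are, since the listing appears to use them deliberately for uncertain or ambiguous bases.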

This corresponds to the amino acid sequence [<SEQ ID 740; ORF98>] (SEQ ID NO: 740; 
ORF98): 

1 MTVTAAEGGK AAKALKKYLI TGILVWLPIA VTVWVVSYIV SASDQLVNLL 

51 PKQWRPQYVL GFNIPGLGVI VAIAVLFVTG LFAANVLGRQ ILAAWDSLLG 

101 RIPVVKSIYS SVKKVSEYVL SDSSRSFKTP VLVPFPQPGI WTIAFVSGQV 

151 SNAVKAALPX DGDYLSVYVP TTPNPTGGYY IMVKKSDVRE LDMSVDEXLK 

201 YVISLGMVIP DDLPVKTLAX PMPSEKADLP EQQ* 
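
The `X` residues in this protein line up with lower-case letters such as `s` in the DNA listing. Read as IUPAC nucleotide ambiguity codes (an assumption; the text does not define them), a codon containing such a letter gives `X` only when its possible readings encode different amino acids. The Python sketch below illustrates this under the standard genetic code; the names are illustrative.

```python
from itertools import product

# Standard genetic code (NCBI table 1) in the conventional TCAG order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}

# IUPAC nucleotide ambiguity codes.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT", "K": "GT", "M": "AC",
    "B": "CGT", "D": "AGT", "H": "ACT", "V": "ACG", "N": "ACGT",
}


def translate_ambiguous_codon(codon: str) -> str:
    """Return the amino acid if every expansion agrees, otherwise 'X'."""
    options = [IUPAC[base] for base in codon.upper()]
    residues = {CODON_TABLE["".join(p)] for p in product(*options)}
    return residues.pop() if len(residues) == 1 else "X"


# Codons taken from the ORF98 listings: AAs (row 451), ATy (row 401),
# and CnT / AAr from the first row of the complete sequence.
examples = {c: translate_ambiguous_codon(c) for c in ("AAs", "ATy", "CnT", "AAr")}
print(examples)
```

Under these assumptions `AAs` (AAC or AAG) is unresolved, which matches the `X` at residue 160, while `ATy` (ATC or ATT) resolves to isoleucine and leaves no `X` in the protein.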

Further work revealed the complete nucleotide sequence [<SEQ ID 741>] (SEQ ID NO: 741): 

1 ATGACGGAAC nTGCGGCCGA AGGCGGCAAA GCTGCCAArG CGTTAAAAAA 

51 ATATCTGATT ACGGGCATTT TGGTCTGGCT GCCGATTGCG GTAACGGTTT 

101 GGGTGGTTTC CTATATCGTT TCCGCGTCCG ATCAGCTCGT CAACCTGCTG 

151 CCGAAGCAAT GGCGGCCGCA ATATGTTTTG GGGTTTAATA TCCCGGGGCT 

201 GGGCGTTATC GTTGCCATTG CCGTATTGTT TGTAACCGGA TTGTTTGCCG 

251 CCAACGTATT GGGTCGGCAG ATCCTCGCCG CGTGGGACAG CCTGTTGGGG 

301 CGGATTCCGG TTGTGAAATC CATCTATTCG AGTGTGAAAA AAGTATCCGA 

351 ATCGCTGCTG TCCGACAGCA GCCGTTCGTT TAAAACGCCG GTACTCGTGC 

401 CGTTTCCCCA GCCCGGTATT TGGACGATTG CTTTCGTGTC AGGGCAGGTG 

451 TCGAATGCGG TTAAGGCCGC ATTGCCGAAG GACGGCGATT ATCTTTCCGT 

501 GTATGTTCCG ACCACGCCGA ATCCGACCGG CGGTTACTAT ATTATGGTAA 

551 AGAAAAGCGA TGTGCGCGAA CTCGATATGA GCGTGGACGA AGCATTGAAA 

601 TATGTGATTT CGCTGGGTAT GGTCATCCCT GACGACCTGC CCGTCAAAAC 

651 ATTGGCAGGA CCTATGCCGT CTGAAAAGGC GGATTTGCCC GAACAACAAT 

701 AA 

This corresponds to the amino acid sequence [<SEQ ID 742; ORF98-1>] (SEQ ID NO: 742; 
ORF98-1): 

1 MTEXAAEGGK AAKALKKYLI TGILVWLPIA VTVWVVSYIV SASDQLVNLL 

51 PKQWRPQYVL GFNIPGLGVI VAIAVLFVTG LFAANVLGRQ ILAAWDSLLG 

101 RIPVVKSIYS SVKKVSESLL SDSSRSFKTP VLVPFPQPGI WTIAFVSGQV 

151 SNAVKAALPK DGDYLSVYVP TTPNPTGGYY IMVKKSDVRE LDMSVDEALK 

201 YVISLGMVIP DDLPVKTLAG PMPSEKADLP EQQ* 

Computer analysis of this amino acid sequence gave the following results: 
Homology with a predicted ORF from N. meningitidis (strain A) 

ORF98 (SEQ ID NO: 740) shows 96.1% identity over a 233 aa overlap with an ORF (ORF98a) 
(SEQ ID NO: 744) from strain A of N. meningitidis: 

10 20 30 40 50 60 

orf 98 . pep MTVTAAEGGKAAKALKKYLITGILVWLPIAVTVWWSYIVSASDQLVNLLPKQWRPQYVL 

II IIMIIMiri I MIIMIIMIIIIIIIIIIMIIIIIIMMII I III 

orf 98a MTEPAAEGGKAAKALKKYLITGILVWLPIAVTVWWSYIVSASDQLVNLLPKQWRPQYVL 

10 20 30 40 50 60 

70 80 90 100 110 120 

orf 98 .pep GFN I PGLG V I VA I A VL F VTGL FAANVLGRQ I LAAWDS LLGR I PWKS I YS S VKKVS E YVL 

M M 1 1 1 1 M i 1 1 1 1 i I M 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 M 1 1 1 1 1 1 ! 1 1 1 1 1 I I 

orf 98a GFN I PGLGVI VA I AVLFVTGL FAANVLGRQ I LAAWDS LLGR I PWKS IYSSVKKVSXSLL 

70 80 90 100 110 120 

130 140 150 160 170 180 

orf 98 . pep SDSSRSFKTPVLVPFPQPGIWTIAFVSGQVSNAVKAALPXDGDYLSVYVPTTPNPTGGYY 

I M I ! 1 1 1 1 M 1 1 1 1 MIIIIIIIIIIIIIIIIMI IIIIIIIIIIIIIIIIIMI 

orf 98a SDSSRSFKTPVLVPFPQSGIWTIAFVSGQVSNAVKAALPKDGDYLSVYVPTTPNPTGGYY 

130 140 150 160 170 180 

190 200 210 220 230 

orf 98 . pep IMVKKSDVRELDMSVDEXLKYVISLGMVIPDDLPVKTLAXPMPSEKADLPEQQX 

1 1 1 1 1 1 1 1 1 1 1 1 1 M E 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 f 1 1 1 llllllllllllll 

orf 98a IMVKKSDVRELDMSVDEALKYVI SLGMVI PDDLPVKTLAGPMPSEKADLPEQQX 

190 200 210 220 230 

The complete length ORF98a nucleotide sequence [<SEQ ID 743>] (SEQ ID NO: 743) is: 



1 ATGACGGAAC CTGCGGCCGA AGGCGGCAAA GCTGCCAAGG CGTTAAAAAA 

51 ATATCTGATT ACGGGCATTT TGGTCTGGCT GCCGATTGCG GTAACGGTTT 

101 GGGTGGTTTC CTATATCGTT TCCGCGTCCG ATCAGCTCGT CAACCTGCTG 

151 CCGAAGCAAT GGCGGCCGCA ATATGTTTTG GGGTTTAATA TCCCGGGGCT 

201 GGGCGTTATC GTTGCCATTG CCGTATTGTT TGTAACCGGA TTATTTGCCG 

251 CAAACGTATT GGGCCGGCAG ATTCTTGCCG CGTGGGACAG CTTGTTGGGG 

301 CGGATTCCGG TTGTGAAGTC CATCTATTCG AGTGTGAAAA AAGTATCCGA 

351 NTCGTTGCTG TCCGACAGCA GCCGTTCGTT TAAAACACCA GTACTCGTGC 

401 CGTTTCCCCA ATCGGGTATT TGGACAATCG CATTCGTGTC CGGTCAGGTG 

451 TCGAATGCGG TTAAGGCCGC ATTGCCGAAG GACGGCGATT ATCTTTCCGT 

501 GTATGTTCCG ACCACGCCGA ATCCGACCGG CGGTTACTAT ATTATGGTAA 

551 AGAAAAGCGA TGTGCGCGAA CTCGATATGA GCGTGGACGA AGCGTTGAAA 

601 TATGTGATTT CGCTGGGTAT GGTCATCCCT GACGACCTGC CCGTCAAAAC 

651 ATTGGCAGGA CCTATGCCGT CTGAAAAGGC GGATTTGCCC GAACAACAAT 

701 AA 

This encodes a protein having amino acid sequence [<SEQ ID 744>] (SEQ ID NO: 744): 

1 MTEPAAEGGK AAKALKKYLI TGILVWLPIA VTVWVVSYIV SASDQLVNLL 

51 PKQWRPQYVL GFNIPGLGVI VAIAVLFVTG LFAANVLGRQ ILAAWDSLLG 

101 RIPVVKSIYS SVKKVSXSLL SDSSRSFKTP VLVPFPQSGI WTIAFVSGQV 

151 SNAVKAALPK DGDYLSVYVP TTPNPTGGYY IMVKKSDVRE LDMSVDEALK 

201 YVISLGMVIP DDLPVKTLAG PMPSEKADLP EQQ* 

ORF98a (SEQ ID NO: 744) and ORF98-1 (SEQ ID NO: 742) show 98.7% identity in 233 aa 
overlap: 

10 20 30 40 50 60 

20 orf 98a . pep MTEPAAEGGKAAKALKKYLITGILVWLPIAVTVWWSYIVSASDQLVNLLPKQWRPQYVL 

III I III 1 1 1 1 1 1 1 1 1 1 1 1 II II II II II III I M Ml 1 1 1 II I III M III 1 1 1 

or f 9 8 - 1 MTEXAAEGGKAAKALKKYL I TG I LVWL P I AVTVWWS Y I VS ASDQLVNLLP KQWRPQYVL 

10 20 30 40 50 60 

70 80 90 100 110 120 

25 orf 98a . pep GFNIPGLGVIVAIAVLFVTGLFAANVLGRQILAAWDSLLGRIPWKSIYSSVKKVSXSLL 

1 1 1 1 1 1 I II I M I M I II I I II II II Ml 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 M 1 1 M 1 1 1 1 III 

orf 98-1 GFNIPGLGVIVAIAVLFVTGLFAANVLGRQILAAWDSLLGRIPWKSIYSSVKKVSESLL 

70 80 90 100 110 120 

130 140 150 160 170 180 

30 orf 98a . pep SDSSRSFKTPVLVPFPQSGIWTIAFVSGQVSNAVKAALPKDGDYLSVYVPTTPNPTGGYY 

MIMMIMM MM MIIMI IMIIIIIIIIMIIIIIMIIIM IIMIMI 

orf 98-1 SDSSRSFKTPVLVPFPQPGIWTIAFVSGQVSNAVKAALPKDGDYLSVYVPTTPNPTGGYY 

130 140 150 160 170 180 

190 200 210 220 230 

35 orf 98a. pep IMVKKSDVRELDMSVDEALKYVISLGMVIPDDLPVKTLAGPMPSEKADLPEQQX 

II II 1 1 1 1 1 1 II M 1 1 1 1 M 1 1 1 M 1 1 1 1 1 1 1 1 1 1 1 M 1 1 1 II II 1 1 1! 1 1 I 

orf 98-1 IMVKKSDVRELDMSVDEALKYVISLGMVIPDDLPVKTLAGPMPSEKADLPEQQX 

190 200 210 220 230 

Homology with a predicted ORF from N. gonorrhoeae 

ORF98 (SEQ ID NO: 740) shows 95.3% identity over a 233 aa overlap with a predicted ORF 
(ORF98ng) (SEQ ID NO: 746) from N. gonorrhoeae: 

10 20 30 40 50 60 

' orf 98. pep MTVTAAEGGKAAKALKKYL I TG I LVWLP I AVTVWWS Y I VSASDQLVNLL PKQWRPQYVL 60 

II Ml 1 1 1 1 I II 1 1 1 1 1 1 M Ml II II 1 1 M II 1 1 II II II II II Ml 1 1 II II II 

orf 98ng MTE PAAEGGKAA KALKKYL I TGI LVWLP I AVTVWWS Y I VSASDQLVNLL PKQWRPQYVL 60 

orf98.pep GFNIPGLGVIVAIAVLFVTGLFAANVLGRQILAAWDSLLGRIPWKSIYSSVKKVSEYVL 120 

I I I i 1 I I 1 [ I I I I E I 1 I I I I I I t I I I I I I I I I I I 1 I I ( I IIIIIIIIIIIIIIIM :| 

orf 98ng GFNIPGLGVIVAIAVLFVTGLFAANVLGRQILAAWDSLLXRIPWKSIYSSVKKVSESLL 120 

orf 98 .pep SDSSRSFKTPVLVPFPQPGIWTIAFVSGQVSNAVKAALPXDGDYLSVYVPTTPNPTGGYY 180 

1 1 1 1 1 1 1 1 1 .] 1 1 1 1 1 MINIM lllllllllll MMMMMMIMMM 

orf 98ng SDSSRSFKTPVLVPFPQSGIWTIAFVSGQVSNAVKAALPQDGDYLSVYVPTTPNPTGGYY 180 

• orf 98 .pep IMVKKSDVRELDMSVDEXLKYVISLGMVIPDDLPVKTLAXPMPSEKADLPEQQ 233 

I M 1 1 1 i I It I It 1 1 1 1 M M I M 1 1 M M I M M 1 1 1 Ml I h 

orf 9 8ng IMVKKSDVRELDMSVDEALKYVISLGMVI PDDLPVKTLAGPMPPEKAELPEQQ 233 

The complete length ORF98ng nucleotide sequence [<SEQ ID 745>] (SEQ ID NO: 745) is 
predicted to encode a protein having amino acid sequence [<SEQ ID 746>] (SEQ ID NO: 746): 

1 MTEPAAEGGK AAKALKKYLI TGILVWLPIA VTVWVVSYIV SASDQLVNLL 

51 PKQWRPQYVL GFNIPGLGVI VAIAVLFVTG LFAANVLGRQ ILAAWDSLLX 

101 RIPVVKSIYS SVKKVSESLL SDSSRSFKTP VLVPFPQSGI WTIAFVSGQV 

151 SNAVKAALPQ DGDYLSVYVP TTPNPTGGYY IMVKKSDVRE LDMSVDEALK 

201 YVISLGMVIP DDLPVKTLAG PMPPEKAELP EQQ* 

Further work revealed the complete nucleotide sequence [<SEQ ID 747>] (SEQ ID NO: 747): 



1 ATGACGGAAC CTGCGGCCGA AGGCGGCAAA GCTGCCAAGG CGTTAAAAAA 

51 ATATCTGATT ACAGGCATTT TGGTCTGGCT GCCGATTGCG GTAACGGTTT 

101 GGGTGGTTTC CTATATCGTT TCCGCGTCCG ACCAGCTTGT CAACCTGCTG 

151 CCGAAGCAAT GGCGGCCGCA ATATGTTTTG GGGTTTAATA TCCCCGGGCT 

201 CGGCGTTATT GTTGCCATTG CCGTATTGTT TGTAACCGGA TTATTTGCCG 
251 CAAACGTGTT GGGCCGGCAG ATTCTTGCCG CGTGGGACAG CCTGTTgggg 

301 cggaTTCCGG TTGTCAAATC CATCTATTCG AGTGTGAAAA AAGTATCCGA 
351 ATCGCTGCTG TCCGACAGCA GCCGTTCGTT TAAAACGCCG GTACTCGTGC 

401 CGTTTCCCCA ATCGGGTATT TGGACAATCG CATTCGTGTC CGGTCAGGTG 
451 TCGAATGCGG TTAAGGCCGC ATTGCCGCAG GATGGCGATT ATCTTTCCGT 
501 GTATGTCCCG ACCACGCCCA ACCCGACCGG CGGTTACTAT ATTATGGTAA 
551 AGAAAAGCGA TGTGCGCGAA CTCGATATGA GCGTGGACGA AGCGTTGAAA 
601 TATGTGATTT CGCTGGGTAT GGTCATCCCT GACGACCTGC CCGTCAAAAC 
651 ATTGGCAGGA CCTATGCCGC CTGAAAAGGC GGAGTTGCCC GAACAACAAT 
701 AA 



This corresponds to the amino acid sequence [<SEQ ID 748; ORF98ng-1>] (SEQ ID NO: 748; 
ORF98ng-1): 

1 MTEPAAEGGK AAKALKKYLI TGILVWLPIA VTVWVVSYIV SASDQLVNLL 

51 PKQWRPQYVL GFNIPGLGVI VAIAVLFVTG LFAANVLGRQ ILAAWDSLLG 

101 RIPVVKSIYS SVKKVSESLL SDSSRSFKTP VLVPFPQSGI WTIAFVSGQV 

151 SNAVKAALPQ DGDYLSVYVP TTPNPTGGYY IMVKKSDVRE LDMSVDEALK 

201 YVISLGMVIP DDLPVKTLAG PMPPEKAELP EQQ* 



ORF98ng-1 (SEQ ID NO: 748) and ORF98-1 (SEQ ID NO: 742) show 97.9% identity in 233 aa 
overlap: 



10 20 30 40 50 60 

orf 98-1 .pep MTEXAAEGGKAAKALKKYLITGILVWLPIAVTVWWSYIVSASDQLVNLLPKQWRPQYVL 

III II 1 1 1 1 1 1 1 M 1 1 1 1 1 1 1 M 1 1 1 1 1 1 M Ml 1 1 1 1 1 1 ! 1 1 1 1 1 1 1 1 1 1 1 i 1 1 1 1 

orf 98ng-l MTEP AAEGGKAAKALKKYL I TG I LWLP I AVTVWVVS Y I VS ASDQLVNLLPKQWRPQYVL 

10 20 30 40 50 60 

70 80 90 100 110 120 

orf 98-1 .pep GFNIPGLGVIVAIAVLFVTGLFAANVLGRQILAAWDSLLGRIPWKSIYSSVKKVSESLL 

' I I I I I I I I I I I I I I I I I I I I I I I I I I I I i I I I I I I ! I I I I • I M I i I I I I I I I I I i I 
orf 98ng-l GFNIPGLGVIVAIAVLFVTGLFAANVLGRQILAAWDSLLGRIPWKSIYSSVKKVSESLL 

70 80 90 100 110 120 

130 140 150 160 170 180 

orf 98 - 1 . pep SDSSRSFKTPVLVPFPQPGIWTIAFVSGQVSNAVKAALPKDGDYLSVYVPTTPNPTGGYY 

1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 II llllllllllllllll MIMIII IMIIIMIIII 

orf 98ng-l SDSSRSFKTPVLVPFPQSGIWTIAFVSGQVSNAVKAALPQDGDYLSVYVPTTPNPTGGYY 

130 140 150 160 170 180 

190 200 210 220 230 

orf 98-1 .pep IMVKKSDVRELDMSVDEALKYVISLGMVIPDDLPVKTLAGPMPSEKADLPEQQX 

MMMMMMM MM MMMMMMMMMM MMMMM 

orf 98ng-l IMVKKSDVRELDMSVDEALKYVI SLGMVI PDDLPVKTLAGPMP PEKAELPEQQX 

190 200 210 220 230 

Based on this analysis, including the fact that the putative transmembrane domains in the 
gonococcal protein are identical to the sequences in the meningococcal protein, it is predicted that 
the proteins from N. meningitidis and N. gonorrhoeae, and their epitopes, could be useful antigens 
for vaccines or diagnostics, or for raising antibodies. 

Example 89 



The following partial DNA sequence was identified in N. meningitidis [<SEQ ID 749>] (SEQ ID 
NO: 749): 



1 ATgAAAACGG TAGTCTGGAT TGTCGTCCTG TTTGCCGCCG CCGTCGGACT 

51 GGCGCTGGCT TCGGGCATTT ACACCGGCGA CGTGTATATC GTACTCGGAC 

101 AGACCATGCT CAGAATCAAC CTGCACGCCT TTGTGTTAGG TTCGCTGATT 

151 GCCGTCGTGG TGTGGTATTT CTTGTTTAAA TTCATTATCG G||GgTACTCA 

201 ATATCCCCGA AAAGATGCAG CGTTTCGGTT CGGCnCGTAA AGGCCkCAAG 

251 ssCGsGCTTG CCTTGAACAA GGCGGGTTTG GCGTATTTTG AAGGGCGTTT 
301 TGAAAAGGCG GAACTAGAAG CCTCACGCGT GTTGGTCAAC AAAGtAGGCC 

351 GaGAGACAAC CGGACTTTGG CATTGATGCT GrGCGCGCAC GCCGCCGGAC 

401 AGATGGAAAA CATCGAssTG CGCGACCGTT ATCTTGCGGA AATCGCCAAA 
451 CTGCCGGAAA AACAGCAGCT TTCCCGTTAT CTTTTGTTGG CGGAATCGGC 
501 GTTGAACCGG CGCGATTACG AAGCGGCGGA AGCCAATCTT CATGCGGCGG 
551 CGAAGATGAA TGCCAACCTT ACGCGCCTCG TGCGTCTGCA . ATTCGTTAC 
601 GCTTTCGACA GGGGCGACGC GTTGCAGGTT CTGGCAAAAA CCGAAAAACT 
651 TTCCAAGGCG GGCGCGTTGG GCAAATCGGA AATGGAACGG TATCAAAATT 
701 GGGCATATCC GTCGCCAGCT GGCGGATGCT GCCGATGCCG CCGCTTTGAA 
751 AACCTGCCTG AAGCGGATTC CCGACAGCCT CAAAAACGGG GAATTGAGCG 
801 TATCGGTTGC GGAAAAGTAC GAACGTTTGG GACTGTATGC CGATGCGGTC 
851 AAATGGGTCA AACAGCATTA TCCGCAsAAC CGCCGCCCCG AGCTTTTGGA 
901 AGCCTTTGTC GAAAGCGTGC GCTTTTTGGG CGAGCGCGAA CAGCAGAAAG 
951 CCATCGATTT TGCCGATGCT TGGCTGAAAG AACAGCCCGA TAACGCGCTT 

1001 CTGCTGATGT ATCTCGGTCG GCTCGCCTTC GGCCGCAAAC TTTGGGGCAA 

1051 GGCAAAAGGC TACCTTGAAG CGAGCATTGC ATTAAAGCCG AGTATTTCCG 
1101 CGCGTTTGGT TCTAACAAAG GTTTTCGACG AAATCGGAGA ACCGCAGAAG 
1151 GCGGAGGCGC AC. . . 

This corresponds to the amino acid sequence [<SEQ ID 750; ORF100>] (SEQ ID NO: 750; 
ORF100): 

1 MKTVVWIVVL FAAAVGLALA SGIYTGDVYI VLGQTMLRIN LHAFVLGSLI 

51 AVVVWYFLFK FIIGVLNIPE KMQRFGSARK GXKXXLALNK AGLAYFEGRF 

101 EKAELEASRV LVNKVGRDNR TLALMLXAHA AGQMENIXXR DRYLAEIAKL 

151 PEKQQLSRYL LLAESALNRR DYEAAEANLH AAAKMNANLT RLVRLXIRYA 

201 FDRGDALQVL AKTEKLSKAG ALGKSEMERY QNWAYRRQLA DAADAAALKT 

251 CLKRIPDSLK NGELSVSVAE KYERLGLYAD AVKWVKQHYP XNRRPELLEA 

301 FVESVRFLGE REQQKAIDFA DAWLKEQPDN ALLLMYLGRL AFGRKLWGKA 

351 KGYLEASIAL KPSISARLVL TKVFDEIGEP QKAEAH... 



Further work revealed the complete nucleotide sequence [<SEQ ID 751>] (SEQ ID NO: 751): 

1 ATGAAAACGG TAGTCTGGAT TGTCGTCCTG TTTGCCGCCG CCGTCGGACT 

51 GGCGCTGGCT TCGGGCATTT ACACCGGCGA CGTGTATATC GTACTCGGAC 

101 AGACCATGCT CAGAATCAAC CTGCACGCCT TTGTGTTAGG TTCGCTGATT 

151 GCCGTCGTGG TGTGGTATTT CTTGTTTAAA TTCATTATCG GCGTACTCAA 

201 TATCCCCGAA AAGATGCAGC GTTTCGGTTC GGCGCGTAAA GGCCGCAAGG 

251 CCGCGCTTGC CTTGAACAAG GCGGGTTTGG CGTATTTTGA AGGGCGTTTT 

301 GAAAAGGCGG AACTAGAAGG CTCACGCGTG TTGGTCAACA AAGAGGCCGG 
351 AGACAACCGG ACTTTGGCAT TGATGCTGGG CGCGCACGCC GCCGGACAGA 
401 TGGAAAACAT CGAGCTGCGC GACCGTTATC TTGCGGAAAT CGCCAAACTG 

451 CCGGAAAAAC AGCAGCTTTC CCGTTATCTT TTGTTGGCGG AATCGGCGTT 
501 GAACCGGCGC GATTACGAAG CGGCGGAAGC CAATCTTCAT GCGGCGGCGA 
551 AGATGAATGC CAACCTTACG CGCCTCGTGC GTCTGCAACT TCGTTACGCT 
601 TTCGACAGGG GCGACGCGTT GCAGGTTCTG GCAAAAACCG AAAAACTTTC 
651 CAAGGCGGGC GCGTTGGGCA AATCGGAAAT GGAACGGTAT CAAAATTGGG 
701 CATACCGCCG CCAGCTGGCG GATGCTGCCG ATGCCGCCGC TTTGAAAACC 
751 TGCCTGAAGC GGATT.CCCGA CAGCCTCAAA AACGGGGAAT TGAGCGTATC 
801 GGTTGCGGAA AAGTACGAAC GTTTGGGACT GTATGCCGAT GCGGTCAAAT 
851 GGGTCAAACA GCATTATCCG CACAACCGCC GCCCCGAGCT TTTGGAAGCC 
901 TTTGTCGAAA GCGTGCGCTT TTTGGGCGAG CGCGAACAGC AGAAAGCCAT 
951 CGATTTTGCC GATGCTTGGC TGAAAGAACA GCCCGATAAC GCGCTTCTGC 

1001 TGATGTATCT CGGTCGGCTC GCCTACGGCC GCAAACTTTG GGGCAAGGCA 

1051 AAAGGCTACC TTGAAGCGAG CATTGCATTA AAGCCGAGTA TTTCCGCGCG 

1101 TTTGGTTCTA GCAAAGGTTT TCGACGAAAT CGGAGAACCG CAGAAGGCGG 

1151 AGGCGCAGCG CAACTTGGTT TTGGAAGCCG TCTCCGATGA CGAACGTCAC 
1201 GCAGCGTTAG AGCAGCATAG CTGA 

This corresponds to the amino acid sequence [<SEQ ID 752; ORF100-1>] (SEQ ID NO: 752; 
ORF100-1): 

1 MKTVVWIVVL FAAAVGLALA SGIYTGDVYI VLGQTMLRIN LHAFVLGSLI 

51 AVVVWYFLFK FIIGVLNIPE KMQRFGSARK GRKAALALNK AGLAYFEGRF 

101 EKAELEASRV LVNKEAGDNR TLALMLGAHA AGQMENIELR DRYLAEIAKL 

151 PEKQQLSRYL LLAESALNRR DYEAAEANLH AAAKMNANLT RLVRLQLRYA 

201 FDRGDALQVL AKTEKLSKAG ALGKSEMERY QNWAYRRQLA DAADAAALKT 

251 CLKRIPDSLK NGELSVSVAE KYERLGLYAD AVKWVKQHYP HNRRPELLEA 

301 FVESVRFLGE REQQKAIDFA DAWLKEQPDN ALLLMYLGRL AYGRKLWGKA 

351 KGYLEASIAL KPSISARLVL AKVFDEIGEP QKAEAQRNLV LEAVSDDERH 

401 AALEQHS* 

Computer analysis of this amino acid sequence gave the following results: 



Homology with a predicted ORF from N. meningitidis (strain A) 



ORF100 (SEQ ID NO: 750) shows 93.5% identity over a 386 aa overlap with an ORF (ORF100a) 
(SEQ ID NO: 754) from strain A of N. meningitidis: 



10 20 30 40 50 60 

MKTWWIWLFAAAVGLALASGIYTGDVYIVLGQTMLRINLHAFVLGSLIAWVWYFLFK 

MINIMI II MUM 1 1 1 M M I II M M 1 1 1 1 1 II 1 1 1 1 1 1 M 1 1 1 

MKTWWIWLFAAAXGLALASGIXTGDVYIVLGQTMLRINLHAFVLGSLIAVWWYFLFK 
10 20 30 40 50 60 

70 80 90 100 110 120 

FIIGVLNIPEKMQRFGSARKGXKXXLALNKAGLAYFEGRFEKAELEASRVLVNKVGRDNR 

Mill MIMIIIII I 1 1 1! M I M II 1 1 1 1 1 II 1 1 M M M II : III 

FIIGVLNXPEKMQRFGSARKGRKAALALNKAGLAYFEGRFEKAELEASRVLGNKEAGDNR 
70 80 90 100 110 120 



orf 100 .pep 
10 orflOOa 

orf 100 .pep 
15 orflOOa 



130 140 150 160 170 180 

orf 100 .pep TLALMLXAHAAGQMENIXXRDRYLAEIAKLPEKQQLSRYLLLAESALNRRDYEAAEANLH 

II I I I II I I I I I I I I I I I I II I I II I I I I I I I I I I I I I I I I I I I I M I I I I I I I I I I 
20 orflOOa TLALMLGAHAAGQMENIELRDRYLAEIAKLPEKQQLSRYLLLAESALNRRDYEAAEANLH 

130 140 150 160 170 180 

190 200 210 220 230 240 

orf 100. pep AAAKMNANLTRLVRLXIRYAFDRGDALQVLAKTEKLSKAGALGKSEMERYQNWAYRRQLA 

MMIM llllll M 1 1 1 1 1 1 1 1 1 M 1 1 1 1 1 Mill IIMIIMI Mill 

25 orflOOa AAAKMNANLTRLVRLQLRYAFDRGDALQVLAKTEKXSKAGAXGKSEMERYQNWAYRRQLX 

190 200 210 220 230 240 

250 260 270 280 290 300 

orf 100 . pep DAADAAALKTCLKRIPDSLKNGELSVSVAEKYERLGLYADAVKWVKQHYPXNRRPELLEA 

II III II 1 1 III MM II III II I II III MIMIIIII III llllll II I MMIM 

30 OrflOOa DAADAAALKTCLKRIPDSLKNGELSVSVAEKYERLGLYADAVKWVKQHYPHNRRPELLEA 

250 260 270 280 290 300 



310 320 330 340 350 360 

orf 100 . pep FVESVRFLGEREQQKAIDFADAWLKEQPDNALLLMYLGRLAFGRKLWGKAKGYLEASIAL 

I I I II II I II M II I II II I I I I II I M I I I I II I I I I I I : I I i I I I I I I I I I I I I I 1 1 
35 orf 100a FVESVRFLGERDQQKAIDFADAWLKEQPDNALLLXYLGRLAYGRKLWGKAKGYLEAS I AL 

310 320 330 340 350 360 



370 380 
orf 100 .pep KPS ISARLVLTKVFDEIGEPQKAEAH 

III III MM II 1 1 MM II I h' 

40 orflOOa KPS I S ARLVLAKVFDETGEPQKAEAQRNLVLAS VAEENRPSAETHX 

370 380 390 400 



The complete length ORF100a nucleotide sequence [<SEQ ID 753>] (SEQ ID NO: 753) is:



1 ATGAAAACGG TAGTCTGGAT TGTCGTCCTG TTTGCCGCCG CNNTCGGGCT 
51 GGCATTGGCG TCGGGCATTN ACACGGGCGA CGTGTATATC GTACTCGGAC

101 AGACCATGCT CAGAATCAAC CTGCACGCCT TTGTGTTAGG TTCGCTGATT 

151 GCCGTCGTGG TGTGGTATTT CCTGTTCAAA TTCATCATCG GCGTACTCAA 

201 TANCCCCGAA AAGATGCAGC GTTTCGGTTC GGCGCGTAAA GGCCGCAAGG 

251 CCGCGCTTGC TTTGAACAAG GCGGGTTTGG CGTATTTTGA AGGGCGTTTT 

301 GAAAAGGCGG AACTTGAAGC CTCGCGCGTA TTGGGAAACA AAGAGGCGGG
351 GGATAACCGG ACTTTGGCAT TGATGTTGGG CGCACATGCC GCCGGGCAGA
401 TGGAAAACAT CGAGCTGCGC GACCGTTATC TTGCGGAAAT CGCCAAACTG

451 CCGGAAAAGC AGCAGCTTTC CCGTTATCTT TTGTTGGCGG AATCGGCGTT
501 GAACCGGCGC GATTACGAAG CGGCGGAAGC CAATCTTCAT GCGGCGGCGA 
551 AGATGAATGC CAACCTTACG CGCCTCGTGC GTCTGCAACT TCGTTACGCT 
601 TTCGACAGGG GCGACGCGTT GCAGGTTCTG GCAAAAACCG AAAAANTTTC 
651 CAAGGCGGGC GCGTNGGGCA AATCGGAAAT GGAACGGTAT CAAAATTGGG 
701 CATACCGCCG CCAGCTGNCG GATGCTGCCG ATGCCGCCGC TTTGAAAACC 
751 TGCCTGAAGC GGATTCCCGA CAGCCTCAAA AACGGGGAAT TGAGCGTATC 
801 GGTTGCGGAA AAGTACGAAC GTTTGGGACT GTATGCCGAT GCGGTCAAAT 
851 GGGTCAAACA GCATTATCCG CACAACCGCC GACCCGAACT TTTGGAAGCN 
901 TTTGTCGAAA GCGTGCGCTT TTTGGGCGAA CGCGATCAGC AGAAAGCCAT 
951 CGATTTTGCC GATGCTTGGC TGAAAGAACA GCCCGATAAT GCGCTTCTGC 

1001 TGANGTATCT CGGTCGGCTC GCCTACGGCC GCAAACTTTG GGGCAAGGCA 

1051 AAAGGCTACC TTGAAGCGAG CATTGCATTA AAGCCGAGTA TTTCCGCGCG 

1101 TTTGGTTCTG GCAAAGGTTT TTGACGAAAC CGGAGAACCG CAGAAGGCGG 

1151 AGGCGCAGCG CAACTTGGTT TTGGCAAGCG TTGCCGAGGA AAACCGNCCT 

1201 TCCGCCGAAA CCCATTGA 
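
For readers who wish to handle the numbered listings electronically, the rows can be joined into one contiguous sequence. The following is a minimal sketch only: the function name join_listing is hypothetical and forms no part of the original disclosure, and the sketch assumes that every letter (or asterisk) on a row belongs to the sequence, so that position numbers, spaces and stray symbols can simply be discarded.

```python
import re


def join_listing(rows):
    """Join numbered listing rows into one contiguous sequence string.

    Assumption: anything that is not a letter or '*' (position numbers,
    spaces, stray marks) is layout and may be dropped.
    """
    return "".join(re.sub(r"[^A-Za-z*]", "", row) for row in rows)


# Two rows in the style of the listing above.
rows = [
    "1 ATGAAAACGG TAGTCTGGAT",
    "1201 TCCGCCGAAA CCCATTGA",
]
print(join_listing(rows))
```

Stray letters introduced by scanning would not be removed by this filter, so the joined length should still be checked against the stated positions.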

This encodes a protein having amino acid sequence [<SEQ ID 754>] (SEQ ID NO: 754):

1 MKTVVWIVVL FAAAXGLALA SGIXTGDVYI VLGQTMLRIN LHAFVLGSLI
51 AVVVWYFLFK FIIGVLNXPE KMQRFGSARK GRKAALALNK AGLAYFEGRF
101 EKAELEASRV LGNKEAGDNR TLALMLGAHA AGQMENIELR DRYLAEIAKL
151 PEKQQLSRYL LLAESALNRR DYEAAEANLH AAAKMNANLT RLVRLQLRYA
201 FDRGDALQVL AKTEKXSKAG AXGKSEMERY QNWAYRRQLX DAADAAALKT
251 CLKRIPDSLK NGELSVSVAE KYERLGLYAD AVKWVKQHYP HNRRPELLEA
301 FVESVRFLGE RDQQKAIDFA DAWLKEQPDN ALLLXYLGRL AYGRKLWGKA
351 KGYLEASIAL KPSISARLVL AKVFDETGEP QKAEAQRNLV LASVAEENRP
401 SAETH*

ORF100a (SEQ ID NO: 754) and ORF100-1 (SEQ ID NO: 752) show 95.1% identity in
overlap:



10 20 30 40 50 60 

orf 100a. pep MKTWWIWLFAAAXGLALASGIXTGDVY I VLGQTMLRINLHAFVLGSL I AVWWYFLFK 

llllllllllllll MINIM MMMMMMMMMMMMMMMMMM 

orf 100 - 1 MKTWWIWLFAAAVGLALASGI YTGDVY I VLGQTMLRINLHAFVLGSL I AVWWYFLFK 

10 20 30 40 50 60 

70 80 90 100 110 120 

orf 100a . pep FI IGVLNXPEKMQRFGSARKGRKAALALNKAGLAYFEGRFEKAELEASRVLGNKEAGDNR 

MMMI MMMMMMMMMMMMMMMMMMMMMI IMMMI 

orf 100-1 F I IGVLNI PEKMQRFGS ARKGRKAALALNKAGLAYFEGRFEKAELEASRVLVNKEAGDNR 

70 80 90 100 110 120 

130 140 150 160 170 180 

orf 100a . pep TLALMLGAHAAGQMENIELRDRYLAEIAKLPEKQQLSRYLLLAESALNRRDYEAAEANLH 

M M M M M M M M M M M M M M M M M M M I M M M M M M M M M M I 

orf 100-1 TLALMLGAHAAGQMENIELRDRYLAE I AKLPEKQQLSRYLLLAESALNRRDYEAAEANLH 

130 140 150 160 170 180 
190 200 210 220 230 240

orf 100a . pep AAAKMNANLTRLVRLQLRYAFDRGDALQVLAKTEKXSKAGAXGKSEMERYQNWAYRRQLX 

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I II II I I I I I I I I I I I I I I I I I 
orf100-1 AAAKMNANLTRLVRLQLRYAFDRGDALQVLAKTEKLSKAGALGKSEMERYQNWAYRRQLA

190 200 210 220 230 240 






250 260 270 280 290 300 

orf 100a . pep DAADAAALKTCLKRIPDSLKNGELSVSVAEKYERLGLYADAVKWVKQHYPHNRRPELLEA 

1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 T 1 1 1 1 1 1 1 1 1 1 1 1 1 II II 1 1 1 1 1 i 1 1 ! I M II 1 1 1 : 

orf 100- 1 DAADAAALKTCLKRIPDSLKNGELSVSVAEKYERLGLYADAVKWVKQHYPHNRRPELLEA 

250 260 270 280 290 300 






310 320 330 340 350 360 

orf 100a . pep FVESVRFLGERDQQKAIDFADAWLKEQPDNALLLXYLGRLAYGRKLWGKAKGYLEASIAL 

I I I I II I I I I : II I I I I I I I I I I I I I I I I M II I I II II II II Ml I II I I II I 
orf 100-1 FVES VRFLGEREQQKAIDFADAWLKEQPDNALLLMYLGRLAYGRKLWGKAKGYLEAS I AL 

310 320 330 340 350 360 






370 380 390 400 

orf 100a . pep KPS ISARLVLAKVFDETGEPQKAEAQRNLVLASVAEENRPSA- ETHX 

II I I I I I I I I I I I I IIIIMIIIIMII :h:::| : I I I . 
orf 100 - 1 KPS I SARLVLAKVFDE IGEPQKAEAQRNLVLEAVSDDERHAALEQHSX 

370 380 390 ' 400 



Homology with a predicted ORF from N. gonorrhoeae

ORF100 (SEQ ID NO: 750) shows 93.3% identity over a 386 aa overlap with a predicted ORF
(ORF100ng) (SEQ ID NO: 756) from N. gonorrhoeae:






orf 100 .pep 
orf 100ng 
orf 100 . pep 
orf lOOng 
orf 100 .pep 
orf lOOng 
orf 100 .pep 
orf lOOng 
orf 100 .pep 
orf lOOng 
orf 100 .pep 
orf lOOng 
orf 100 . pep 
orf lOOng 



MKTWWIWLFAAAVGLALASGIYTGDVYIVLGQTMLRINLHAFVLGSLIAVWWYFLFK 

llllll llllllll lllllllllllllllll IIIIIIMilllllllllIMM 

MKTWWIWLFAAAVGLALASGIYTGDVYIVLGQTMLRINLHAFVLGSLIAVWWYFLFK 



KPS I S ARLVLTKVFDE I GE PQKAEAH 
I I II II IhlMI :: Mllh 

KPS I PARLVLAKVFDETAQSQKAEAQRNLVLAS VAGENRPSAETR 



386 



405 



60 



60 



120 



FIIGVLNIPEKMQRFGSARKGXKXXLALNKAGLAYFEGRFEKAELEASRVLVNKVGRDNR 

I I I I I I I , I h I : I IIMM I I I I I I I I I I I I I I I I I I I I I I I M M : Ml 

FIIGVLNIPENMRRSGSARKGRKAALALNKAGLAYFEGRFEKAELEASRVLGNKEAGDNR 12 0 

TLALMLXAHAAGQMENIXXRDRYLAEIAKLPEKQQLSRYLLLAESALNRRDYEAAEANLH 180 

M II 1 1 1 1 1 1 1 1 1 M 1 1 1 1 1 1 1 1 1 II 1 1 1 1 1 1 1 1 1 1 1 II II 1 1 1 1 1 1 1 1 1 1 1 1 

TLALMLGAHAAGQMEN I ELRDRYLAE I AKLPEKQQLSRYLLLAESALNRRDYEAAEANLH 180 

AAAKMNANLTRLVRLX I RYAFDRGDALQVLAKTEKLS KAGALGKSEMERYQNWAYRRQLA 240 

lllllll MINI : 1 1 1 1 1 1 1 1 1 1 1 II 1 1 1 1 1 1 1 1 1 1 1 II I Ml II 1 1 1 1 1 1 1 h! 

AAAKMNANLTRLVRLQLRYAFDRGDALQVLAKTEKLSKAGALGKSEMERYQNWAYRRQMA 24 0 

DAADAAALKTCLKRI PDSLKNGELS VS VAEKYERLGLYADAVKWVKQHYPXNRRPELLEA 300 

II II II III II II II Ml II II I III 1 1 II II I II lllllll II MM II III II II II 

DAADAAALKTCLKRI PDSLKNGELSVS VAEKYERLGLYADAVKWVKQHYPHNRRPELLEA 300 

FVESVRFLGEREQQKAIDFADAWLKEQPDNALLLMYLGRLAFGRKLWGKAKGYLEAS I AL 360 

IIMMI III IIIMM IIIMIIIII IIIIMIMIIIMIII Mil III MINI! 

FVESVRFLGEREQQKAIDFADSWLKEQPDNALLLMYLGRLAYGRKLWGKAKGYLEAS IAL 360 

The complete length ORF100ng nucleotide sequence [<SEQ ID 755>] (SEQ ID NO: 755) is:

1 ATGAAAACGG TAGTCTGGAT TGTTGTCCTG TTTGCCGCCG CCGTCGGACT 

51 GGCGCTGGCT TCGGGCATTT ACACCGGCGA CGTGTATATC GTACTCGGAC 

101 AGACCATGCT CAGAATCAAC CTGCACGCCT TTGTGTTAGG TTCGCTGATT 

151 GCCGTCGTGG TGTGGTATTT CCTGTTTAAA TTCATCATCG GCGTACTCAA 

201 TATCCCCGAA AATATGCGGC GTTCCGGTTC GGCGCGGAAA GGCCGCAAGG 

251 CCGCGCTTGC CTTGAATAAG GCGGGTTTGG CGTATTTCGA AGGGCGTTTT 

301 GAAAAGGCGG AACTCGAAGC CTCTCGAGTG TTGGGCAACA AAGAGGCCGG 

351 AGACAACCGG ACTTTGGCAT TGATGCTGGG CGCGCACGCG GCAGGACAGA 

401 TGGAAAATAT CGAGCTGCGC GACCGTTATC TTGCGGAAAT CGCCAAACTG

451 CCGGAAAAAC AGCAGCTTTC CCGCTATCTT CTGCTGGCGG AATCGGCGTT

501 AAACCGGCGC GATTACGAAG CGGCGGAAGC CAATCTTCAT GCGGCGGCGA 

551 AGATGAATGC CAACCTTACG CGCCTCGTGC GTCTGCAACT TCGTTACGCC 

601 TTCGATCGGG GCGATGCGTT GCAGGTTCTG GCAAAAaccG AAAAACTTTC 

651 CAAGGCGGGC GCGTTGGGCA AATCGGAAAT GGAACGGTAT CAAAATTGGG 

701 CATACCGCCG CCAGATGGCG GATGCTGCCG ATGCCGCCGC TTTGAAAACC 

751 TGCCTGAAGC GGATTCCCGA CAGCCTCAAA AACGGGGAAT TGagcGTATC 

801 GGTTGCGGAA AAGTACGAAC GTTTGGGACT GTATGCCGAT GCGGTCAAAT 

851 GGGTCAAACA GCATTATCCG CACAACCGCC GCCCCGAGCT TTTGGAAGCC 

901 TTTGTCGAAA GCGTGCGCTT TTTGGGCGAG CGCGAACAGC AGAAAGCCAT 

951 CGATTTTGCC GATTCTTGGC TGAAAGAACA GCCCGATAAC GCGCTTCTGC 

1001 TGATGTATCT CGGCCGGCTC GCCTACGGCC GCAAACTTTG GGGTAAGGCA 

1051 AAAGGCTACC TTGAAGCGAG TATTGCACTG AAGCCGAGTA TTCCGGCGCG 

1101 TTTGGTGTTG GCAAAGGTTT TTGACGAAAC CGCACAGTCG CAAAAAGCCG 

1151 AAGCACAGCG CAACTTGGTT TTGGCAAGCG TTGCCGGGGA AAACCGCCCT 

1201 TCCGCCGAAA CCCGTTGA

This encodes a protein having amino acid sequence [<SEQ ID 756>] (SEQ ID NO: 756):

1 MKTVVWIVVL FAAAVGLALA SGIYTGDVYI VLGQTMLRIN LHAFVLGSLI
51 AVVVWYFLFK FIIGVLNIPE NMRRSGSARK GRKAALALNK AGLAYFEGRF
101 EKAELEASRV LGNKEAGDNR TLALMLGAHA AGQMENIELR DRYLAEIAKL
151 PEKQQLSRYL LLAESALNRR DYEAAEANLH AAAKMNANLT RLVRLQLRYA
201 FDRGDALQVL AKTEKLSKAG ALGKSEMERY QNWAYRRQMA DAADAAALKT
251 CLKRIPDSLK NGELSVSVAE KYERLGLYAD AVKWVKQHYP HNRRPELLEA
301 FVESVRFLGE REQQKAIDFA DSWLKEQPDN ALLLMYLGRL AYGRKLWGKA
351 KGYLEASIAL KPSIPARLVL AKVFDETAQS QKAEAQRNLV LASVAGENRP
401 SAETR*

ORF100ng (SEQ ID NO: 756) and ORF100-1 (SEQ ID NO: 752) show 95.3% identity in
overlap:



10 20 30 40 50 60 

orf 100-1. pep MKTVVWIVVLFAAAVGLALASGIYTGDVYIVLGQTMLRINLHAFVLGSLIAVVVWYFLFK 

1 1 1 1 1 1 1 II 1 1 1 1 1 M I 1 1 1 1 1 1 1 1 1 1 1 1 II 1 1 1 1 M 1 1 II 1 1 1 1 1 1 1 1 1 1 1 : 1 1 1 1 1 1 

orf 100ng MKTVWIVVLFAAAVGLALASGIYTGDVYIVLGQTMLRINLHAFVLGSLIAVVVWYFLFK 

10 20 30 40 50 60 

70 80 90 100 110 120 

or f 1 0 0 - 1 . pep FI IGVLNIPEKMQRFGSARKGRKAALALNKAGLAYFEGRFEKAELEASRVLVNKEAGDNR 

I | I I I I I M I : I : I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I Ij I I I I I I I I 
orf lOOng FIIGVLNIPENMRRSGSARKGRKAALALNKAGLAYFEGRFEKAELEASRVLGNKEAGDNR 

70 80 90 100 110 120 

130 140 150 160 170 180

orf 100-1 .pep TLALMLGAHAAGQMENIELRDRYLAEIAKLPEKQQLSRYLLLAESALNRRDYEAAEANLH 

II I II II M I II 1 1 1 1 II M 1 1 1 1 II 1 1 ! II II III II M Ml MM Hill 1 1 1 II 

orf lOOng TLALMLGAHAAGQMENIELRDRYLAEIAKLPEKQQLSRYLLLAESALNRRDYEAAEANLH 
130 140 150 160 170 180

190 200 210 220 230 240 

orf 100-1 pep AAAKMNANLTRLVRLQLRYAFDRGDALQVLAKTEKLS KAGALGKS EMERYQNWAYRRQLA 

IIMMIM llllll IMIMMMMIIMMMMMI MIMMMMIMM 

or f 1 0 Ong AAAKMNANLTRLVRLQLRYAFDRGDALQVLAKTEKLS KAGALGKS EMERYQNWAYRRQMA 

190 200 210 220 230 240

250 260 270 280 290 300 

orf 100-1. pep DAADAAALKTCLKRI PDSLKNGELSVS VAEKYERLGLYADAVKWVKQHYPHNRRPELLEA 

M llllll IMMIMMMMMMMMMMMIMI IIIIIIMIMIIMM 

orf 1 0 Ong DAADAAALKTCLKRI PDSLKNGELSVSVAEKYERLGLYADAVKWVKQHYPHNRRPELLEA 

250 260 270 280 290 300

310 320 330 340 350 360 

or f 1 0 0 - 1 pep FVESVRFLGEREQQKAIDFADAWLKEQPDNALLLMYLGRLAYGRKLWGKAKGYLEAS I AL 

II I II II II I II II II I Nihil II III II II II III M II I M II M I M MM I Ml 

orf 100ng FVESVRFLGEREQQKAIDFADSWLKEQPDNALLLMYLGRLAYGRKLWGKAKGYLEASIAL 
310 320 330 340 350 360

370 380 390 -400 

orf 100- 1 . pep KPSISARLVLAKVFDEIGEPQKAEAQRNLVLEAVSDDERHAALEQHSX 

MM MINIUM IMIll I I I I I M: I I 

orf 1 0 On KPS I PARLVLAKVFDETAQSQKAEAQRNLVLASVAGENRPSAETRX 

370 380 390 400

Based on this analysis, including the presence of a putative leader sequence, a putative
transmembrane domain, and an RGD motif, it is predicted that the proteins from N. meningitidis and
N. gonorrhoeae, and their epitopes, could be useful antigens for vaccines or diagnostics, or for
raising antibodies.

Example 90 

The following DNA sequence, believed to be complete, was identified in N. meningitidis [<SEQ ID
757>] (SEQ ID NO: 757):

1 ATGATGTTTT CTTGGTTCAA GCTGTTTCAC TTGTTTTTTG TCATTTCGTG
51 GTTTGCAGGG CTGTTTTACC TGCCGAGGAT TTTCGTCAAT ATGGCGATGA
101 TTGATGTGCC GCGCGGCAAT CCCGAGTATG TGCGTCTGTC GGGCATGGCG
151 GTGCGGCTGT ACCGTTTTAT GTCGCCGTTG GGCTTCGGCG CGGTCGTGTT
201 CGGCGCGGCG ATACCGTTTG CCGCCGGCTG GTGGGGCAGC GGCTGGGTAC
251 ACGTCAAACT GTGTTTGGGC TTGATGCTCT TGGCTTACCA GTTGTATTGC
301 GGCGTGCTGC TGCGCCGTTT TCAGGATTAC AGCAATGCTT TTTCACACCG
351 CTGGTACCGC GTGTTCAACG AAATCCCCGT GCTGCTGATG GTTGCCGCGC
401 TGTATsTGGT CGTGTTCAAA CCGTTTTGA

This corresponds to the amino acid sequence [<SEQ ID 758; ORF102>] (SEQ ID NO: 758;
ORF102):



1 MMFSWFKLFH LFFVISWFAG LFYLPRIFVN MAMIDVPRGN PEYVRLSGMA
51 VRLYRFMSPL GFGAVVFGAA IPFAAGWWGS GWVHVKLCLG LMLLAYQLYC
101 GVLLRRFQDY SNAFSHRWYR VFNEIPVLLM VAALYXVVFK PF*

Further work revealed the complete nucleotide sequence [<SEQ ID 759>] (SEQ ID NO: 759):



1 ATGATGTTTT CTTGGTTCAA GCTGTTTCAC TTGTTTTTTG TCATTTCGTG 

51 GTTTGCAGGG CTGTTTTACC TGCCGAGGAT TTTCGTCAAT ATGGCGATGA 

101 TTGATGTGCC GCGCGGCAAT CCCGAGTATG TGCGTCTGTC GGGCATGGCG 

151 GTGCGGCTGT ACCGTTTTAT GTCGCCGTTG GGCTTCGGCG CGGTCGTGTT 

201 CGGCGCGGCG ATACCGTTTG CCGCCGGCTG GTGGGGCAGC GGCTGGGTAC 

251 ACGTCAAACT GTGTTTGGGC TTGATGCTCT TGGCTTACCA GTTGTATTGC 

301 GGCGTGCTGC TGCGCCGTTT TCAGGATTAC AGCAATGCTT TTTCACACCG 

351 CTGGTACCGC GTGTTCAACG AAATCCCCGT GCTGCTGATG GTTGCCGCGC 

401 TGTATCTGGT CGTGTTCAAA CCGTTTTGA

This corresponds to the amino acid sequence [<SEQ ID 760; ORF102-1>] (SEQ ID NO: 760;
ORF102-1):



1 MMFSWFKLFH LFFVISWFAG LFYLPRIFVN MAMIDVPRGN PEYVRLSGMA
51 VRLYRFMSPL GFGAVVFGAA IPFAAGWWGS GWVHVKLCLG LMLLAYQLYC
101 GVLLRRFQDY SNAFSHRWYR VFNEIPVLLM VAALYLVVFK PF*
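
The correspondence between a nucleotide listing and its amino acid listing can be checked by conceptual translation. The sketch below is illustrative only and is not the method used to prepare the listings: the name translate is hypothetical, the standard genetic code is assumed, the reading frame is assumed to start at position 1, and any codon containing a symbol other than A, C, G or T is reported as X (an assumption that mirrors the X appearing in ORF102, where the listing contains the codon sTG).

```python
from itertools import product

# Standard genetic code, codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def translate(dna):
    """Translate DNA in frame 1; codons with ambiguous symbols give 'X'."""
    seq = "".join(dna.split()).upper()   # drop the spaces between 10-mers
    usable = len(seq) - len(seq) % 3     # ignore a trailing partial codon
    return "".join(
        CODON_TABLE.get(seq[i:i + 3], "X") for i in range(0, usable, 3)
    )


# First 30 nucleotides of the ORF102-1 listing, then its last 9.
print(translate("ATGATGTTTT CTTGGTTCAA GCTGTTTCAC"))
print(translate("CCGTTTTGA"))
```

The first call returns the first ten residues of the protein listing and the second ends in the stop symbol, written * in the listings.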

Computer analysis of this amino acid sequence gave the following results: 



Homology with HP1484 hypothetical integral membrane protein of H. pylori (accession number
AE000647) (SEQ ID NO: 1160)



ORF102 (SEQ ID NO: 758) and HP1484 (SEQ ID NO: 1160) show 33% aa identity in 143 aa
overlap:



orf102    3 FSWFKLFHLFFVISWFAGLFYLPRIFVNMAMIDVPRGNPEYVRLSGMAVRLYRFMSPLGF 62
            F W K FH+  VISW A LFYLPR+FV  A     +     V++     +LY F++
HP1484    8 FLWVKAFHVIAVISWMAALFYLPRLFVYHAENAHKKEFVGVVQIQEK--KLYSFIASPAM 65

orf102   63 GAVVFGAAIPFAAG---WWGSGWVHVKLCLGLMLLAYQLYCGVLLRRFQDYSNAFSHRWY 119
            G  +    +        +   GW+H KL L ++LLAY  YC   +R  +      + R+Y
HP1484   66 GFTLITGILMLLIEPTLFKSGGWLHAKLALVVLLLAYHFYCKKCMRELEKDPTRRNARFY 125

orf102  120 RVFNEIPXXXXXXXXXXXXFKPF 142
            RVFNE P             KPF
HP1484  126 RVFNEAPTILMILIVILVWKPF 148





Homology with a predicted ORF from N. meningitidis (strain A)

ORF102 (SEQ ID NO: 758) shows 99.3% identity over a 142 aa overlap with an ORF (ORF102a)
(SEQ ID NO: 762) from strain A of N. meningitidis:

10 20 30 40 50 60 

orf 102 .pep MMFSWFKLFHLFFVISWFAGLFYLPRIFVNMAMIDVPRGNPEYVRLSGMAVRLYRFMSPL 

III lllllll MIMIMIII IIIIIIMIII MIIIIIIIMMIIIIIIII lllllll 
orf102a MMFSWFKLFHLFFVISWFAGLFYLPRIFVNMAMIDVPRGNPEYVRLSGMAVRLYRFMSPL

10 20 30 40 50 60 

70 80 90 100 110 120 

orf 102 .pep GFGAWFGAAIPFAAGWWGSGWVHVKLCLGLMLLAYQLYCGVLLRRFQDYSNAFSHRWYR 

1 1 1 1 ! II 1 1 1 1 1 M 1 1 1 1 1 1 1 ! 1 1 1 1 1 1 1 1 1 1 1 1 1 M 1 1 1 1 1 1 1 1 1 1 Ml I M 1 1 M 

orf 102a GFGAVVFGAAIPFAAGWWGSGWVHVKLCLGLMLLAYQLYCGVLLRRFQDYSNAFSHRWYR 

70 80 90 100 110 120 

130 140 
orf 102 . pep VFNE I PVLLMVAALYXWFKPFX 
Mill Illlllll lllllll 
or f 1 0 2a VFNE I PVLLMVAALYLWFKPFX 

130 140 

The complete length ORF102a nucleotide sequence [<SEQ ID 761>] (SEQ ID NO: 761) is:

1 ATGATGTTTT CTTGGTTCAA GCTGTTTCAC TTGTTTTTTG TCATTTCGTG 

51 GTTTGCAGGG CTGTTTTACC TGCCGAGGAT TTTCGTCAAT ATGGCGATGA 

101 TTGATGTGCC GCGCGGCAAT CCCGAGTATG TGCGTCTGTC GGGCATGGCG 

151 GTGCGGCTGT ACCGTTTTAT GTCGCCGTTG GGCTTCGGCG CGGTCGTGTT 

201 CGGCGCGGCG ATACCGTTTG CCGCCGGCTG GTGGGGCAGC GGCTGGGTAC 

251 ACGTCAAACT GTGTTTGGGC TTGATGCTCT TGGCTTACCA GTTGTATTGC 

301 GGCGTGCTGC TGCGCCGTTT TCAGGATTAC AGCAATGCTT TTTCACACCG 

351 CTGGTACCGC GTGTTCAACG AAATCCCCGT GCTGCTGATG GTTGCCGCGC 

401 TGTATCTGGT CGTGTTCAAA CCGTTTTGA 

This encodes a protein having amino acid sequence [<SEQ ID 762>] (SEQ ID NO: 762):

1 MMFSWFKLFH LFFVISWFAG LFYLPRIFVN MAMIDVPRGN PEYVRLSGMA
51 VRLYRFMSPL GFGAVVFGAA IPFAAGWWGS GWVHVKLCLG LMLLAYQLYC
101 GVLLRRFQDY SNAFSHRWYR VFNEIPVLLM VAALYLVVFK PF*

ORF102a (SEQ ID NO: 762) and ORF102-1 (SEQ ID NO: 760) show complete identity in 142 aa
overlap:

10 20 30 40 50 60 

orf 102a. pep MMFSWFKLFHLFFVISWFAGLFYLPRIFVNMAMIDVPRGNPEYVRLSGMAVRLYRFMSPL 

1 1 1 1 1 II I II 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 II 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 

orf 102 - 1 MMFSWFKLFHLFFVISWFAGLFYLPRIFVNMAMIDVPRGNPEYVRLSGMA VRLYRFMSPL 

10 20 30 40 50 60 

70 80 90 100 110 120 

orf 102a . pep GFGAWFGAA I PFAAGWWGSGWVHVKXCLGLMLLAYQLYCGVLLRRFQDYSNAFSHRWYR 

I M II 1 1 1 1 M 1 1 M I II I M 1 1 II 1 1 1 M II 1 1 1 1 1 M 1 1 1 1 1 1 1 1 1 1 1 M II I M 1 1 1 

orf 102 - 1 GFGAWFGAAIPFAAGWWGSGWVHVKLCLGLMLLAYQLYCGVLLRRFQDYSNAFSHRWYR 

70 80 90 100 110 120 

130 140
orf 102a. pep VFNE I PVLLMVAALYLWFKPFX 

IMIIIMI IMIMMMI 
or f 1 0 2 - 1 VFNE I PVLLMVAALYLWFKPFX 

130 140 

Homology with a predicted ORF from N. gonorrhoeae 

ORF102 (SEQ ID NO: 758) shows 97.9% identity over a 142 aa overlap with a predicted ORF
(ORF102ng) (SEQ ID NO: 764) from N. gonorrhoeae:

orf 102 .pep MMFSWFKLFHLFFVISWFAGLFYLPRIFVNMAMIDVPRGNPEYVRLSGMAVRLYRFMSPL 60 

III MINIM IMMII IMIIIIIIIMIIIMMII IMIillllllMIMIM 

orf 102ng MMFSWFKLFHLFFVISWFAGLFYLPRIFVNMAMIDAPRGNPEYVRLSGMAVRLYRFMSPL 60 

orf 102 .pep GFGAWFGAAIPFAAGWWGSGWVHVKLCLGLMLLAYQLYCGVLLRRFQDYSNAFSHRWYR 120 

IIIIMIIIII II MINI llllllllllllll llllllllllll IMMII 

or f 1 0 2 ng GFGAWFGAAI P FAAGRWGSGWVHVKLCLGLMLLAYQL YCGVLLRRFQD YSNAFS HRWYR 120 

orf 102. pep VFNE I PVLLMVAAL YXWFKPF 142 

IIIIIIIIIIIIIII MINI 
orfl02ng VFNE I PVLLMVAAL YLWFKP F 142 

The complete length ORF102ng nucleotide sequence [<SEQ ID 763>] (SEQ ID NO: 763) is:

1 ATGATGTTTT CTTGGTTCAA GCTGTTTCAC TTGTTTTTTG TCATTTCGTG
51 GTTTGCAGGG CTGTTTTACC TGCCGAGGAT TTTCGTCAAT ATGGCGATGA
101 TTGATGCGCC GCGCGGCAAT CCCGAGTATG TGCGCCTGTC GGGGATGGCG
151 GTGCGGTTGT ACCGTTTTAT GTCGCCTTTG GGTTTCGGCG CGGTCGTGTT
201 CGGCGCGGCG ATACCGTTTG CCGCcggccg GTGGGGCagc ggctggGTTC
251 ACGTCAAACT GTGTTTGGGC TTGATGCTCT TGGCTTATCA GTTGTATTGC
301 GGCGTGCTGC TGCGCCGTTT TCAGGATTAC AGCAATGCTT TTTCACACCG
351 CTGGTACCGC GTGTTCAAcg aAATCCCCGT GCTGCTGATG GTTGCCGCGC
401 TGTATCTGGT CGTGTTCAAA CCGTTTTGA

This encodes a protein having amino acid sequence [<SEQ ID 764>] (SEQ ID NO: 764):

1 MMFSWFKLFH LFFVISWFAG LFYLPRIFVN MAMIDAPRGN PEYVRLSGMA
51 VRLYRFMSPL GFGAVVFGAA IPFAAGRWGS GWVHVKLCLG LMLLAYQLYC
101 GVLLRRFQDY SNAFSHRWYR VFNEIPVLLM VAALYLVVFK PF*

ORF102ng (SEQ ID NO: 764) and ORF102-1 (SEQ ID NO: 760) show 98.6% identity in 142 aa
overlap:

10 20 30 40 50 60 

orf 102-1 .pep MMFSWFKLFHLFFVISWFAGLFYLPRIFVNMAMIDVPRGNPEYVRLSGMAVRLYRFMSPL 

I I I I I I I I I I I I I I I I I I I I I I I I II I I I I I II M I II I I I I I I I I M M M I M 1 1 
orf 102ng MMFSWFKLFHLFFVISWFAGLFYLPRIFVNMAMIDAPRGNPEYVRLSGMAVRLYRFMSPL 

10 20 30 40 50 60 



70 80 90 100 110 120 

orf 102 - 1 . pep GFGAWFGAAI PFAAGWWGSGWVHVKLCLGLMLLAYQL YCGVLLRRFQD YSNAFS HRWYR 

orf102ng GFGAVVFGAAIPFAAGRWGSGWVHVKLCLGLMLLAYQLYCGVLLRRFQDYSNAFSHRWYR

70 80 90 100 110 120 

130 140 
orf 102-1 .pep VFNEI PVLLMVAALYLWFKPFX 

IIIIIIIIIIIIIIIMIIIIII 

or f 1 0 2 ng VFNE I PVLLMVAAL YL WFKP FX 

130 140 
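
The identity figures quoted for such alignments are the number of matching positions divided by the length of the overlap. The sketch below illustrates that arithmetic only: the name percent_identity is hypothetical, the two inputs are assumed to be already aligned and of equal length (with - marking a gap), and it is not the program that produced the alignments in this document.

```python
def percent_identity(a, b):
    """Percent identity of two pre-aligned, equal-length sequences.

    Assumption: '-' marks a gap, and a gap never counts as a match.
    """
    if len(a) != len(b):
        raise ValueError("aligned sequences must have the same length")
    if not a:
        raise ValueError("aligned sequences must not be empty")
    matches = sum(1 for x, y in zip(a, b) if x == y and x != "-")
    return 100.0 * matches / len(a)


# Residues 31-40 of ORF102-1 and ORF102ng differ at one position (V/A).
print(percent_identity("MAMIDVPRGN", "MAMIDAPRGN"))

# Two mismatches over the 142 aa overlap: 140 matching positions.
print(round(100.0 * 140 / 142, 1))
```

The second figure agrees with the 98.6% identity stated above for ORF102ng and ORF102-1, whose listings differ at two positions.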

In addition, ORF102ng (SEQ ID NO: 764) shows significant homology to a membrane protein
(SEQ ID NO: 1160) from H. pylori:



gi|2314656 (AE000647) conserved hypothetical integral membrane protein
[Helicobacter pylori] Length = 148
Score = 79.2 bits (192), Expect = 1e-14
Identities = 50/147 (34%), Positives = 68/147 (46%), Gaps = 13/147 (8%)



Query:   3 FSWFKLFHLFFVISWFAGLFYLPRIFVNMAMIDAPRGNPEYVRLSGMAVRLYRFMSPLGF 62
           F W K FH+ VISW A LFYLPR+FV A + V++ +LY F+ +
Sbjct:   8 FLWVKAFHVIAVISWMAALFYLPRLFVYHAENAHKKEFVGVVQIQEK--KLYSFIASPAM 65

Query:  63 GAVVFGAAIPFAAGRWGSGWVHVKLCLGLMLLAYQLYCGVLLRRFQDYSNAFS 115
           G + + F +G GW+H KL L ++LLAY YC +R + +
Sbjct:  66 GFTLITGILMLLIEPTLFKSG GWLHAKLALVVLLLAYHFYCKKCMRELEKDPTRRN 121

Query: 116 HRWYRVFNEIPXXXXXXXXXXXXFKPF 142
           R+YRVFNE P KPF
Sbjct: 122 ARFYRVFNEAPTILMILIVILVWKPF 148





Based on this analysis, it is predicted that these proteins from N. meningitidis and N. gonorrhoeae,
and their epitopes, could be useful antigens for vaccines or diagnostics, or for raising antibodies.



Example 91 

The following partial DNA sequence was identified in N. meningitidis [<SEQ ID 765>] (SEQ ID
NO: 765):



1 ATGGCAAAAA TGATGAAATG GGCGGCTGTT GCGGCGGTCG CGGCGGCAGC
51 GGTTTGGGGC GGATGGTCTT AACTGAAGCC CGAGCCGCAC GTGCTTGATA
101 TTACGGAAAC GGTCAGGCGC GGC //
//.. ATTTCGTTTA CGATTTTGTC CGAACCGGAT ACGCCGATTA AGGCGAAGCT
51 CGACAGCGTC GACCCCGGGC TGACCACGAT GTCGTCGGGC GGTTACAACA
101 GCAGTACGGA TACGGCTTCC AATGCGGTCT ACTATTATGC CCGTTCGTTT
151 GTGCCGAATC CGGACGGCAA ACTCGCCACG GGGATGACGA CGCAGAATAC
201 GGTTGAAATC GACGGCGTGA AAAATGTGCT GATTATTCCG TCGCTGACCG
251 TGAAAAATCG CGGCGGCAAG GCGTTTGTGC GCGTGTTGGG TGCGGACGGC
301 AAGGCGGCGG AACGCGAAAT CCGGACCGGT ATGAGAGACA GTATGAATAC
351 CGAAGTAAAA AGCGGGTTGA AAGAGGGGGA CAAAGTGGTC ATCTCCGAAA
401 TAACCGCCGC CGAGCAACAG GAAAGCGGCG AACGCGCCCT AGGCGGCCCG
451 CCGCGCCGAT AA

This corresponds to the amino acid sequence [<SEQ ID 766; ORF85>] (SEQ ID NO: 766;
ORF85):

1 MAKMMKWAAV AAVAAAAVWG GWS.LKPEPH VLDITETVRR G
51
101
151
201 I SFTILSEPDT
251 PIKAKLDSVD PGLTTMSSGG YNSSTDTASN AVYYYARSFV PNPDGKLATG
301 MTTQNTVEID GVKNVLIIPS LTVKNRGGKA FVRVLGADGK AAEREIRTGM
351 RDSMNTEVKS GLKEGDKVVI SEITAAEQQE SGERALGGPP RR*

Further work revealed the further partial nucleotide sequence [<SEQ ID 767>] (SEQ ID NO: 767):



1 .. GTATCGGTCG GCGCGCAGGC ATCGGGGCAG ATTAAGATAC TTTATGTCAA
51 ACTCGGGCAA CAGGTTAAAA AGGGCGATTT GATTGCGGAA ATCAATTCGA
101 CCTCGCAGAC CAATACGCTC AATACGGAAA AATCCAAGTT GGAAACGTAT
151 CAGGCGAAGC TGGTGTCGGC ACAGATTGCA TTGGGCAGCG CGGAGAAGAA
201 ATATAAGCGT CAGGCGGCGT TATGGAAGGA AAACGCGACT TCCAAAGAGG
251 ATTTGGAAAG CGCGCAGGAT GCGTTTGCCG CCGCCAAAGC CAATGTTGCC
301 GAGCTGAAGG CTTTAATCAG ACAGAGCAAA ATTTCCATCA ATACCGCCGA
351 GTCGGAATTG GGCTACACGC GCATTACCGC AACGATGGAC GGCACGGTGG
401 TGGCGATTCT CGTGGAAGAG GGGCAGACTG TGAACGCGGC GCAGTCTACG
451 CCGACGATTG TCCAATTGGC GAATCTGGAT ATGATGTTGA ACAAAATGCA
501 GATTGCCGAG GGCGATATTA CCAAGGTGAA GGCGGGGCAG GATATTTCGT
551 TTACGATTTT GTCCGAACCG GATACGCCGA TTAAGGCGAA GCTCGACAGC
601 GTCGACCCCG GGCTGACCAC GATGTCGTCG GGCGGTTACA ACAGCAGTAC
651 GGATACGGCT TCCAATGCGG TCTACTATTA TGCCCGTTCG TTTGTGCCGA
701 ATCCGGACGG CAAACTCGCC ACGGGGATGA CGACGCAGAA TACGGTTGAA
751 ATCGACGGCG TGAAAAATGT GCTGATTATT CCGTCGCTGA CCGTGAAAAA
801 TCGCGGCGGC AAGGCGTTTG TGCGCGTGTT GGGTGCGGAC GGCAAGGCGG
851 CGGAACGCGA AATCCGGACC GGTATGAGAG ACAGTATGAA TACCGAAGTA
901 AAAAGCGGGT TGAAAGAGGG GGACAAAGTG GTCATCTCCG AAATAACCGC
951 CGCCGAGCAA CAGGAAAGCG GCGAACGCGC CCTAGGCGGC CCGCCGCGCC
1001 GATAA

This corresponds to the amino acid sequence [<SEQ ID 768; ORF85-1>] (SEQ ID NO: 768;
ORF85-1):

1 .. VSVGAQASGQ IKILYVKLGQ QVKKGDLIAE INSTSQTNTL NTEKSKLETY
51 QAKLVSAQIA LGSAEKKYKR QAALWKENAT SKEDLESAQD AFAAAKANVA
101 ELKALIRQSK ISINTAESEL GYTRITATMD GTVVAILVEE GQTVNAAQST
151 PTIVQLANLD MMLNKMQIAE GDITKVKAGQ DISFTILSEP DTPIKAKLDS
201 VDPGLTTMSS GGYNSSTDTA SNAVYYYARS FVPNPDGKLA TGMTTQNTVE
251 IDGVKNVLII PSLTVKNRGG KAFVRVLGAD GKAAEREIRT GMRDSMNTEV
301 KSGLKEGDKV VISEITAAEQ QESGERALGG PPRR*

Computer analysis of this amino acid sequence gave the following results:



Homology with a predicted ORF from N. meningitidis (strain A)

ORF85 (SEQ ID NO: 766) shows 87.8% identity over a 41 aa overlap and 99.3% identity over
a 153 aa overlap with an ORF (ORF85a) (SEQ ID NO: 770) from strain A of N. meningitidis:

10 20 30 40 

or f 8 5 . pep MAKMM KW AA V AA VAAAA VWGG W S - LKPEPHVLDITETVRRG 

1 1 . 1 1 1 1 1 1 1 : 1 1 1 1 1 1 ill! HUM 

orf 85a MAKMM KWAAVAAVAAAAVWGGWSYLKPEPQAAY I TETVRRGD I S RTVS ATGE I S PSNLVS 

10 20 30 40 50 60 

// 

80 90 100 

orf 85 pep ISFTILSEPDTPIKAKLDSVDPGLTTMSSG 

1 1 1 1 M 1 1 1 1 1 M 1 1 1 1 1 U M 1 1 1 1 1 1 

orf 85a T I VQLANLDMMLNKMQ I AEGD I TKVKAGQD ISFTILSEPDTPIKAKLDSVDPGLTTMSSG 

210 220 230 240 250 260 



110 120 130 140 150 160 

orf 85 . pep GYNSSTDTASNAVYYYARSFVPNPDGKLATGMTTQNTVEIDGVKNVLI I PSLTVKNRGGK 
I I I I I III I I I I II I I I I I I I ! I I I I I I I I I I I I I I I I I I I I M H M I I I I I I I I I h 
orf 8 5a GYNSSTDTASNAVYYYARS FVPNPDGKLATGMTTQNTVEIDGVKNVLI I PS LTVKNRGGR 

270 280 290 300 310 320 



170 180 190 200 210 220 

orf 85 . pep AFVRVLGADGKAAEREIRTGMRDSMNTEVKSGLKEGDKWISEITAAEQQESGERALGGP 

llllllllilllllllllll llllll llllll lllllllllllllllllllllMI 
or f 8 5a AFVRVLGADGKAAERE I RTGMRDSMNTEVKSGLKEGDKWI SE I TAAEQQESGERALGGP 

330 340 350 360 370 380 



230 

orf 85. pep PRRX 
I I I I 

orf 8 5a PRRX 
390 



The complete length ORF85a nucleotide sequence [<SEQ ID 769>] (SEQ ID NO: 769) is:



1 ATGGCAAAAA TGATGAAATG GGCGGCTGTT GCGGCGGTCG CGGCGGCAGC
51 GGTTTGGGGC GGATGGTCTT ATCTGAAGCC CGAGCCGCAG GCTGCTTATA
101 TTACGGAAAC GGTCAGGCGC GGCGACATCA GCCGGACGGT TTCTGCAACA
151 GGGGAGATTT CGCCGTCCAA CCTGGTATCG GTCGGCGCGC AGGCATCGGG
201 GCAGATTAAG AAACTTTATG TCAAACTCGG GCAACAGGTT AAAAAGGGCG
251 ATTTGATTGC GGAAATCAAT TCGACCTCGC AGACCAATAC GCTCAATACG
301 GAAAAATCCA AATTGGAAAC GTATCAGGCG AAGCTGGTGT CGGCACAGAT
351 TGCATTGGGC AGCGCGGAGA AGAAATATAA GCGTCAGGCG GCGTTGTGGA
401 AGGATGATGC GACCGCTAAA GAAGATTTGG AAAGCGCACA GGATGCGCTT
451 GCCGCCGCCA AAGCCAATGT TGCCGAGCTG AAGGCTCTAA TCAGACAGAG
501 CAAAATTTCC ATCAATACCG CCGAGTCGGA ATTGGGCTAC ACGCGCATTA
551 CCGCAACGAT GGACGGCACG GTGGTGGCGA TTCTCGTGGA AGAGGGGCAG
601 ACTGTGAACG CGGCGCAGTC TACGCCGACG ATTGTCCAAT TGGCGAATCT
651 GGATATGATG TTGAACAAAA TGCAGATTGC CGAGGGCGAT ATTACCAAGG
701 TGAAGGCGGG GCAGGATATT TCGTTTACGA TTTTGTCCGA ACCGGATACG
751 CCGATTAAGG CGAAGCTCGA CAGCGTCGAC CCCGGGCTGA CCACGATGTC
801 GTCGGGCGGC TACAACAGCA GTACGGATAC GGCTTCCAAT GCGGTCTACT
851 ATTATGCCCG TTCGTTTGTG CCGAATCCGG ACGGCAAACT CGCCACGGGG
901 ATGACGACGC AGAATACGGT TGAAATCGAC GGTGTGAAAA ATGTGCTGAT
951 TATTCCGTCG CTGACCGTGA AAAATCGCGG CGGCAGGGCG TTTGTGCGCG
1001 TGTTGGGTGC AGACGGCAAG GCGGCGGAAC GCGAAATCCG GACCGGTATG
1051 AGAGACAGTA TGAATACCGA AGTAAAAAGC GGGTTGAAAG AGGGGGACAA
1101 AGTGGTCATC TCCGAAATAA CCGCCGCCGA GCAGCAGGAA AGCGGCGAAC
1151 GCGCCCTAGG CGGCCCGCCG CGCCGATAA

This encodes a protein having amino acid sequence [<SEQ ID 770>] (SEQ ID NO: 770):



1 MAKMMKWAAV AAVAAAAVWG GWSYLKPEPQ AAYITETVRR GDISRTVSAT
51 GEISPSNLVS VGAQASGQIK KLYVKLGQQV KKGDLIAEIN STSQTNTLNT
101 EKSKLETYQA KLVSAQIALG SAEKKYKRQA ALWKDDATAK EDLESAQDAL
151 AAAKANVAEL KALIRQSKIS INTAESELGY TRITATMDGT VVAILVEEGQ
201 TVNAAQSTPT IVQLANLDMM LNKMQIAEGD ITKVKAGQDI SFTILSEPDT
251 PIKAKLDSVD PGLTTMSSGG YNSSTDTASN AVYYYARSFV PNPDGKLATG
301 MTTQNTVEID GVKNVLIIPS LTVKNRGGRA FVRVLGADGK AAEREIRTGM
351 RDSMNTEVKS GLKEGDKVVI SEITAAEQQE SGERALGGPP RR*

ORF85a (SEQ ID NO: 770) and ORF85-1 (SEQ ID NO: 768) show 98.2% identity in 334 aa
overlap:



30 40 50 60 70 80 

orf 85a . pep PQAAYI TETVRRGD I SRTVS ATGE I S PSNLVS VGAQASGQ I KKLYVKLGQQVKKGDL I AE 

MINIMUM Mill 1 1 1 1 1 M I M I 

orf 85-1 VS VGAQASGQ I KI LYVKLGQQVKKGDL I AE 

20 10 20 30 

90 100 110 120 130 140 

orf 85a . pep INSTSQTNTLNTEKSKLETYQAKLVSAQIALGSAEKKYKRQAALWKDDATAKEDLESAQD 
I I I I I I I I I i I I I I I M I I I ■ I I I I I I I I I I I I I I I I I! I I I I I I- I h I I I II I I I I 
orf 85-1 INSTSQTNTLNTEKSKLETYQAKLVSAQIALGSAEKKYKRQAALWKENATSKEDLESAQD 
25 40 50 60 70 80 90 

150 160 170 180 190 200 

orf 85a . pep ALAAAKA1STVAELKALIRQSKISINTAESELGYTRITATMDGTVVAILVEEGQTVNAAQST 
I :| , I I I I I I i I I I I M I I I I I I I I I I I I I I I I I I I I II I I I I I I I I I I I I I I I I I I I I 
orf 85-1 AFAAAKANVAELKALIRQSKISINTAESELGYTRITATMDGTWAILVEEGQTVNAAQST 
30 100 110 120 130 140 150 

210 220 230 240 250 260 

orf 85a . pep PTIVQLANLDMMLNKMQIAEGDITKVKAGQDISFTILSEPDTPIKAKLDSVDPGLTTMSS 

1 1 1 i 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 M II 1 1 1 1 1 1 i 1 1 1 1 III 1 . 1 1 1 1 

orf 85-1 PTIVQLANLDMMLNKMQIAEGDITKVKAGQDISFTILSEPDTPIKAKLDSVDPGLTTMSS 
35 160 170 180 190 200 210 

270 280 290 300 310 320 

orf 85a . pep GGYNSSTDTASNAVYYYARS FVPNPDGKLATGMTTQNTVEIDGVKNVLI IPSLTVKNRGG 
M I I I I I I I I I I I I I I I I I I I I I I I I h I I I I I I I I I I II I I I I I I I I I I II I I i I I I I 
orf 85-1 GGYNSSTDTASNAVYYYARS FVPNPDGKLATGMTTQNTVEIDGVKNVLI IPSLTVKNRGG 

40 220 230 240 250 260 270 

330 340 350 360 370 380 

orf 85a . pep RAFVRVLGADGKAAEREIRTGMRDSMNTEVKSGLKEGDKWISEITAAEQQESGERALGG 
: I I I II I I I I I I I I I I I I I I II I I I I I I I I I I I I M I I M I I I I I I I I II I M I I I 
orf 85-1 KAFVRVLGADGKAAERE I RTGMRDSMNTE VKSGLKEGDKWI SE I TAAEQQESGERALGG 

45 280 290 300 310 320 330 



390 

orf 85a. pep PPRRX 

Mill 

orf85-l PPRRX 

Figure 19D shows plots of hydrophilicity, antigenic index, and AMPHI regions for ORF85a (SEQ
ID NO: 770).

Homology with a predicted ORF from N. gonorrhoeae

ORF85 (SEQ ID NO: 766) shows a high degree of identity with a predicted ORF (ORF85ng) (SEQ
ID NO: 772) from N. gonorrhoeae:



ORF85 1 MAKMMKWAAVAAVAAAAVWGGWS . LKPEPHVLDITETVRRG 4 0 

I I I II I I I I I I I I I I I I I I I I I I I I I I I : llhllll 
ORF85ng 1 MAKMM KWAAVAAVAAAAVWGGWS YLKP E PQAAY I TEAVRRGD I S RTVS AT 50 



ORF85 ISFTILSEPDT 250 

MINIUM 

ORF85ng 201 TVNAAQS TPT I VQLANLDMMLNKMQ I AEGD I TKVKAGQD ISFTILSEPDT 250 

ORF85 251 PIKAKLDSVDPGLTTMSSGGYNSSTDTASNAVYYYARSFVPNPDGKLATG 300 

I II III II I lllllll IM N I 1 1 MM III II I IMIIIIIIMI II 

ORF85ng 251 P I KAKLDS VDPGLTTMS SGGYNS S TDTASNAVYY YARS FVPNPDGKLATG 300 

ORF85 301 MTTQNTVE IDGVKNVL 1 1 PS LTVKNRGGKAFVRVLGADGKAAERE I RTGM 350 

MIMIIMMMMM I IMIIIIIIIMIIIIIII Mil II II 

ORF85ng 301 MTTQNTVE I DGVKNVLL I P S LTVKNRGGKAF VR VLGADGKAVERE I RTGM 350 

ORF85 152 RDSMNTEVKSGLKEGDKWI SE I TAAEQQESGERALGGPPRR 393 

M M M II I I I I I I I I I M M I I Mill I I I I I I I II I I I 
ORF85ng 351 KDSMNTEVKSGLKEGDKWISE I TAAEQQESGERALGGPPRR 393 



The complete length ORF85ng nucleotide sequence [<SEQ ID 771>] (SEQ ID NO: 771) is:



1 ATGGCAAAAA TGATGAAATG GGCGGCTGTT GCGGCGGTCG CGGCGGCaac
51 GGTTTGGGGC GGATGGTCTT ATCTGAAGCC CGAACCGCAG GCTGCTTATA
101 TTACGGAaac ggTCAGGCGC GGCGATATCA GCCGGACGGT TTGCGCGACG
151 GgcgAGATTT CGCCGTCCAA CCTGGTATCG GTCGGCGCGC AGGCTTCGGG
201 GCAGATTAAA AAGCTTTATG TCAAACTCGG GCAACAGGTC AAAAAGGGCG
251 ATTTGATTGC GGAAATCAAT TCGACCACGC AGACCAACAC GATCGATATG
301 GAAAAATCCA AATTGGAAAC GTATCAGGCG AAGCTGGTGT CGGCACAGAT
351 TGCATTGGGC AGCGCGGAGA AGAAATATAA GCGTCAGGCG GCGTTGTGGA
401 AGGATGATGC GACCTCTAAA GAAGATTTGG AAAGCGCGCA GGATGCGCTT
451 GCCGCCGCCA AAGCCAATGT TGCCGAGTTG AAGGCTTTAA TCAGACAGAG
501 CAAAATTTCC ATCAATACCG CCGAGTCGGA TTTGGGCTAC ACGCGCATTA
551 CCGCGACGAT GGACGGCACG GTGGTGGCGA TTCCCGTGGA AGAGGGGCAG
601 ACTGTGAACG CGGCGCAGTC TACGCCGACG ATTGTCCAAT TGGCGAATCT
651 GGATATGATG TTGAACAAAA TGCAGATTGC CGAGGGCGAT ATTACCAAGG
701 TGAAGGCGGG GCAGGATATT TCGTTTACGA TTTTGTCCGA ACCGGATACG
751 CCGATTAAGG CGAAGCTCGA CAGCGTCGAC CCCGGGCTGA CCACGATGTC
801 GTCGGGCGGC TACAACAGGA GTACGGATAC GGCTTCCAAT GCGGTCTATT
851 ATTATGCCCG TTCGTTTGTG CCGAATCCGG ACGGCAAACT CGCCACGGGG
901 ATGACGACGC AGAATACGGT TGAAATCGAC GGTGTGAAAA ATGTGTTGCT
951 TATTCCGTCG CTGACCGTGA AAAATCGCGG CGGCAAGGCG TTCGTACGCG
1001 TGTTGGGTGC GGACGGCAAG GCAGTGGAAC GCGAAATCCG GACCGGTATG
1051 AAAGACAGTA TGAATACCGA AGTGAAAAGC GGGTTGAAAG AGGGGGACAA
1101 AGTGGTCATC TCCGAAATAA CCGCCGCCGA GCAGCAGGAA AGCGGCGAAC
1151 GCGCCCTAGG CGGCCCGCCG CGCCGATAA

This encodes a protein having amino acid sequence [<SEQ ID 772>] (SEQ ID NO: 772):

1 MAKMMKWAAV AAVAAAAVWG GWSYLKPEPQ AAYITEAVRR GDISRTVSAT
51 GEISPSNLVS VGAQASGQIK KLYVKLGQQV KKGDLIAEIN STTQTNTIDM
101 EKSKLETYQA KLVSAQIALG SAEKKYKRQA ALWKDDATSK EDLESAQDAL
151 AAAKANVAEL KALIRQSKIS INTAESDLGY TRITATMDGT VVAIPVEEGQ
201 TVNAAQSTPT IVQLANLDMM LNKMQIAEGD ITKVKAGQDI SFTILSEPDT
251 PIKAKLDSVD PGLTTMSSGG YNSSTDTASN AVYYYARSFV PNPDGKLATG
301 MTTQNTVEID GVKNVLLIPS LTVKNRGGKA FVRVLGADGK AVEREIRTGM
351 KDSMNTEVKS GLKEGDKVVI SEITAAEQQE SGERALGGPP RR*
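
The sequence listings in this section share one layout: a starting position followed by up to five groups of ten residues per row. As a minimal sketch of that layout only, the Python below regenerates such rows from a raw sequence string. The function name `format_listing` and its parameters are illustrative assumptions and are not taken from the specification.

```python
def format_listing(seq: str, first: int = 1, group: int = 10, per_line: int = 5) -> str:
    """Lay out a sequence as rows of `per_line` groups of `group` residues.

    `first` is the 1-based position of the first residue supplied
    (hypothetical parameter, useful when formatting only part of a listing).
    """
    # Remove any whitespace already present so spaced input is accepted.
    seq = "".join(seq.split())
    width = group * per_line
    rows = []
    for start in range(0, len(seq), width):
        chunk = seq[start:start + width]
        groups = [chunk[i:i + group] for i in range(0, len(chunk), group)]
        rows.append(f"{first + start} " + " ".join(groups))
    return "\n".join(rows)


if __name__ == "__main__":
    # Final 79 bases of the ORF85ng listing, positions 1101 onward.
    tail = ("AGTGGTCATC TCCGAAATAA CCGCCGCCGA GCAGCAGGAA AGCGGCGAAC "
            "GCGCCCTAGG CGGCCCGCCG CGCCGATAA")
    print(format_listing(tail, first=1101))
```

Because whitespace is stripped before regrouping, the same helper can be used to check that a row has lost or gained a residue: a short group in the regenerated output shows where the count is off.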



ORF85ng (SEQ ID NO: 772) and ORF85-1 (SEQ ID NO: 768) show 96.1% identity in 334 aa
overlap:



30 40 50 60 70 80 

orf 85ng PQAAY I TETVRRGD I SRTVS ATGE I S PSNLVS VGAQASGQ I KKLYVKLGQQVKKGDL I AE 

I II II II II II I I I II II II I II I I MM 
orf 85- 1 VS VGAQASGQ I KI LYVKLGQQVKKGDL I AE 

20 10 20 30 



90 100 110 120 130 140 

orf 85ng INSTTQTNTIDMEKSKLETYQAKLVSAQIALGSAEKKYKRQAALWKDDATSKEDLESAQD 

I III: III:: || II I I II I I I II II I I II II II I M II I I II I h M II I II I I II II 
orf 85-1 INSTSQTNTLNTEKSKLETYQAKLVSAQIALGSAEKKYKRQAALWKENATSKEDLESAQD 
25 40 50 60 70 80 90 

150 160 170 180 190 200 

orf 85ng ALAAAKANVAELKAL I RQS KI S I NTAESDLGYTR I TATMDGTWAI P VEEGQTVNAAQST 

h 1 1 II 1 1 1 II 1 1 1 M I II 1 1 II II M h II I II I II 1 1 1 II 1 1 II II I llllllll 

orf 85-1 AFAAAKANVAELKALIRQSKISINTAESELGYTRITATMDGTWAILVEEGQTVNAAQST 
30 100 110 120 130 140 150 

210 220 230 240 250 260 

orf 85ng PTIVQIJVNLDMMLNKMQIAEGDITKVKAGQDISFTILSEPDTPIKAKLDSVDPGLTTMSS 

1 1 1 1 1 1 1 1 1 II I II 1 1 II 1 1 1 1 1 1 1 II 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 M 1 1 1 1 1 1 II II II II 

orf 85-1 PTIVQLANLDMMLNKMQIAEGDITKVKAGQDISFTILSEPDTPIKAKLDSVDPGLTTMSS 
35 160 170 180 190 200 210 



270 280 290 300 310 320 

orf 85ng GGYNSSTDTASNAVYYYARSFVPNPDGKLATGMTTQNTVEIDGVKNVLLIPSLTVKNRGG 

I I II I I I I I I II II I I I II I II M I I I I II II II II I I I I I II I I II h I II II I II I II 
orf 85-1 GGYNSSTDTASNAVY YYARS FVPNPDGKLATGMTTQNTVE I DGVKNVL 1 1 PSLTVKNRGG 

40 220 230 240 250- 260 270 



330 340 350 360 370 380 

or f 8 5 ng KAFVRVLGADGKAVERE I RTGM KDSMNTEVKS GLKEGDKWI S E I TAAEQQES GERALGG 

I I I I I I I II I I I : I I I II I MM I I I M I I I I I I I I M I I I I I I M I I I I II I I I I I I 
orf 85-1 KAFVRVLGADGKAAEREIRTGMRDSMNTEVKSGLKEGDKWISEITAAEQQESGERALGG 
45 280 290 300 310 320 330 

390 

orf85ng PPRRX 
Mill 

orf85-l PPRRX 

In addition, ORF85ng (SEQ ID NO: 772) shows significant homology to an E. coli membrane
fusion protein (SEQ ID NO: 1161):

gi|1787104 (AE000189) o380; 27% identical (27 gaps) to 332 residues from membrane
fusion protein precursor, MTRC_NEIGO SW: P43505 (412 aa) [Escherichia coli] Length
= 380

Score = 193 bits (485), Expect = 2e-48

Identities = 120/345 (34%), Positives = 182/345 (51%), Gaps = 13/345 (3%)

Query: 29 PQAAY I TETVRRGD I SRTVS ATGE I S PSNLVS VGAQASGQ I KKL YVKLGQQVKKGDL I AE 88 
P Y T VR GD+ ++V ATG+ + V VGAQ SGQ+K L V +G +VKK L+ 

10 Sbjct: 41 PVPTYQTLIWPGDLQQSVLATGKLDALRKVTDVGAQVSGQLKTLSVAIGDKVKKDQLLGV 100 

Query: 89 INSTTQTNTIDMEKSKLETYQAKLVSAQIALGSAEKKYKRQAALWKDDATSKEXXXXXXX 148 

1+ N I ++ L +A+ A+ L A Y RQ L + A S + + 

Sbjct: 101 IDPEQAENQIKEVEATLMELRAQRQQAEAELKLARVTYSRQQRLAQTKAVSQQDLDTAAT 160 

Query: 149 XXXXXXXXXXXXXXXIRQSKISINTAESDLGYTRITATMDGTWAIPVEEGQTVNAAQST 208 
15 I++++ S++TA+++L YTRI A M G V I +GQTV AAQ 

Sbjct: 161 EMAVKQAQ I GT I DAQ I KRNQASLDTAKTNLD YTR I VAPMAGEVTQ I TTLQGQTVI AAQQA 220 

Query: 209 PT I VQLANLDMMLNKMQ I AEGD I TKVKAGQD I S FT I LSE PDTP I KAKLDS VDPGLTTMS S 268 

P 1+ LA++ ML K Q++E D+ +K GQ FT+L + P T + + + VP 
Sbjct: 221 PN I LTLADMS AMLVKAQVS EADVI HLKPGQKAWFT VLGDPLTRYEGQ I KDVLP 273 

20 Query: 269 GGYNSSTDTASNAVYYYARS FVPNPDGKLATGMTTQNTVEIDGVKNVLLIPSLTVKNRGG 328 

+ + ++A++YYAR VPNP+G L MT Q + + + VKNVL IP + + G 
Sbjct: 274 TPEKVNDAIFYYARFEVPNPNGLLRLDMTAQVHIQLTDVKNVLTIPLSALGDPVG 328 

Query: 329 KAFVRV - LGADGKAVERE I RTGMKDSMNTEVKSGLKEGDKWI SE 372 
+V L +G+ ERE+ G ++ + E+ GL+ GD+WI E 
25 Sbjct: 329 DNRYKVKLLRNGETRERE VT IGARNDTDVE I VKGLEAGDE W I GE 373 

Based on this analysis, it was predicted that the proteins from N. meningitidis and N. gonorrhoeae, 
and their epitopes, could be useful antigens for vaccines or diagnostics, or for raising antibodies. 

ORF85-1 (SEQ ID NO: 768) (40.4 kDa) was cloned in the pGex vectors and expressed in E. coli, as
described above. The products of protein expression and purification were analyzed by
SDS-PAGE. Figure 19A shows the results of affinity purification of the GST-fusion protein. Purified
GST-fusion protein was used to immunise mice, whose sera were used for Western blot (Figure
19B), FACS analysis (Figure 19C), and ELISA (positive result). These experiments confirm that
ORF85-1 (SEQ ID NO: 768) is a surface-exposed protein, and that it is a useful immunogen.

Example 92



The following partial DNA sequence was identified in N. meningitidis [<SEQ ID 773>] (SEQ ID
NO: 773):

1 ..ATTCCCGCCA CGATGACATT TGAACGCAGC GGCAATGCTT ACAAAATCGT
51 TTCGACGATT AAAGTGCCGC TATACAATAT CCGTTTCGAG TCCGGCGGTA
101 CGGTTGTCGG CAATACCCTG CACCCTACCT ACTATAGAGA CATACGCAGG
151 GGCAAACTGT ATGCGGAAgc CAAATTCGCC GACgGcAGCG TAACTTACGG
201 CAAAGCGGGC GAGAGCAAAA CCGAGCAAAG CCCCAAGGCT ATGGATTTGT
251 TCACGCTTGC CTGGCAGTTG GCGGCAAATG ACGCGAAACT CCCCCCGGGG
301 CTGAAAATCA CCAACGGCAA AAAACTTTAT TCCGTCGGCG GTTTGAATAA
351 GGCGGGTACA GGAAAATACA GCATAGGCGG CGTGGAAACC GAAGTCGTCA
401 AATATCGGGT GCGGCGCGGC GACGATGCGG TAATGTATTT cTTCGCACCG
451 TCCCTGAACA ATATTCCGGC ACAAATCGGC TATACCGACG ACGGCAAAAC
501 CTATACGCTG AAACTCAAAT CGGTGCAGAT CAACGGCCAG GCAGCCAAAC
551 CGTAA

This corresponds to the amino acid sequence [<SEQ ID 774; ORF120>] (SEQ ID NO: 774;
ORF120):

1 ..IPATMTFERS GNAYKIVSTI KVPLYNIRFE SGGTVVGNTL HPTYYRDIRR
51 GKLYAEAKFA DGSVTYGKAG ESKTEQSPKA MDLFTLAWQL AANDAKLPPG
101 LKITNGKKLY SVGGLNKAGT GKYSIGGVET EVVKYRVRRG DDAVMYFFAP
151 SLNNIPAQIG YTDDGKTYTL KLKSVQINGQ AAKP*

Further work revealed the complete nucleotide sequence [<SEQ ID 775>] (SEQ ID NO: 775):

1 ATGATGAAGA CTTTTAAAAA TATATTTTCC GCCGCCATTT TGTCCGCCGC
51 CCTGCCGTGC GCGTATGCGG CAGGGCTGCC CCAATCCGCC GTGCTGCACT
101 ATTCCGGCAG CTACGGCATT CCCGCCACGA TGACATTTGA ACGCAGCGGC
151 AATGCTTACA AAATCGTTTC GACGATTAAA GTGCCGCTAT ACAATATCCG
201 TTTCGAGTCC GGCGGTACGG TTGTCGGCAA TACCCTGCAC CCTACCTACT
251 ATAGAGACAT ACGCAGGGGC AAACTGTATG CGGAAGCCAA ATTCGCCGAC
301 GGCAGCGTAA CTTACGGCAA AGCGGGCGAG AGCAAAACCG AGCAAAGCCC
351 CAAGGCTATG GATTTGTTCA CGCTTGCCTG GCAGTTGGCG GCAAATGACG
401 CGAAACTCCC CCCGGGGCTG AAAATCACCA ACGGCAAAAA ACTTTATTCC
451 GTCGGCGGTT TGAATAAGGC GGGTACAGGA AAATACAGCA TAGGCGGCGT
501 GGAAACCGAA GTCGTCAAAT ATCGGGTGCG GCGCGGCGAC GATGCGGTAA
551 TGTATTTCTT CGCACCGTCC CTGAACAATA TTCCGGCACA AATCGGCTAT
601 ACCGACGACG GCAAAACCTA TACGCTGAAA CTCAAATCGG TGCAGATCAA
651 CGGCCAGGCA GCCAAACCGT AA

This corresponds to the amino acid sequence [<SEQ ID 776; ORF120-1>] (SEQ ID NO: 776;
ORF120-1):

1 MMKTFKNIFS AAILSAALPC AYAAGLPQSA VLHYSGSYGI PATMTFERSG
51 NAYKIVSTIK VPLYNIRFES GGTVVGNTLH PTYYRDIRRG KLYAEAKFAD
101 GSVTYGKAGE SKTEQSPKAM DLFTLAWQLA ANDAKLPPGL KITNGKKLYS
151 VGGLNKAGTG KYSIGGVETE VVKYRVRRGD DAVMYFFAPS LNNIPAQIGY
201 TDDGKTYTLK LKSVQINGQA AKP*

Computer analysis of this amino acid sequence gave the following results:

Homology with a predicted ORF from N. meningitidis (strain A)

ORF120 (SEQ ID NO: 774) shows 92.4% identity over a 184 aa overlap with an ORF (ORF120a)
(SEQ ID NO: 778) from strain A of N. meningitidis:

10 20 30 

orf 120 . pep IPATMTFERSGNAYKIVSTIKVPLYNIRFE 

5 | I | | : II IIIIIIIIMIIIIII 

orf 120a SAAI LSAALPCAYAAGLPXSAVLHYSGSYG I PATXXXXXXXNAXKIVST I KVPLYNI RFE 

10 20 30 40 50 60 

40 50 60 70 80 90 

orf 120 . pep SGGTWGNTLHPTYYRDIRRGKLYAEAKFADGSVTYGKAGESKTEQSPKAMDLFTLAWQL 

10 I MM 1 1 Mill MM III II MM I Mill II II MM = 1 1 II II 1 1 1 II 1 1 1 1 

orf 120a SGGTWGNTLHPTYYRDIRRGKLYAEAKFADGSVTYGKAXXXXXXQSPKAMDLFTLAWQL 
70 80 90 100 110 120 

100 110 120 130 140 150 

orf 12 0 . pep AANDAKLPPGLKITNGKKLYSVGGLNKAGTGKYS IGGVETEWKYRVRRGDDAVMYFFAP 

15 II I II I M I M I M 1 1 II II I II I II II 1 1 1 1 1 1 1 II 1 1 II M M 1 1 1 1 II 1 1 II 1 1 1 1 1 

orf 12 0a AANDAKLPPGLKITNGKKLYSVGGLNKAGTGKYS IGGVETEWKYRVRRGDDAVMYFFAP 

130 140 150 160 170 180 

160 170 180 

orf 12 0 .pep SLNNI PAQIGYTDDGKTYTLKLKSVQINGQAAKPX 

20 || | || | | || I II I II I II II I II I I I I I II I I I I I 

or f 12 0a SLNNI PAQIGYTDDGKTYTLKLKSVQINGQAAKPX 

190 200 210 220 

The complete length ORF120a nucleotide sequence [<SEQ ID 777>] (SEQ ID NO: 777) is:

1 ATGATGAAGA CTTTTAAAAA TATATTTTCC GCCGCCATTT TGTCCGCCGC
51 CCTGCCGTGC GCGTATGCGG CAGGGCTGCC CNAATCCGCC GTGCTGCACT
101 ATTCCGGCAG CTACGGCATT CCCGCCACNA NNANNTNNGN ACNNNGNGNC
151 AATGCTTNCA AAATCGTTTC GACGATTAAA GTGCCGCTAT ACAATATCCG
201 TTTCGAGTCC GGCGGTACGG TTGTCGGCAA TACCCTGCAC CCTACCTACT
251 ATAGAGACAT ACGCAGGGGC AAACTGTATG CGGAAGCCAA ATTCGCCGAC
301 GGCAGCGTAA CCTACGGCAA AGCGGNNNNN ANCNNNNNNG NGCAAAGCCC
351 CAAGGCTATG GATTTGTTCA CGCTTGCNTG GCAGTTGGCG GCAAATGACG
401 CGAAACTCCC CCCGGGGCTG AAAATCACCA ACGGCAAAAA ACTTTATTCC
451 GTCGGCGGTT TGAATAAGGC GGGTACAGGA AAATACAGCA TAGGCGGCGT
501 GGAAACCGAA GTCGTCAAAT ATCGGGTGCG GCGCGGCGAC GATGCGGTAA
551 TGTATTTCTT CGCACCGTCC CTGAACAATA TTCCGGCACA AATCGGCTAT
601 ACCGACGACG GCAAAACCTA TACGCTGAAA CTCAAATCGG TGCAGATCAA
651 CGGCCAGGCA GCCAAACCGT AA

This encodes a protein having amino acid sequence [<SEQ ID 778>] (SEQ ID NO: 778):

1 MMKTFKNIFS AAILSAALPC AYAAGLPXSA VLHYSGSYGI PATXXXXXXX
51 NAXKIVSTIK VPLYNIRFES GGTVVGNTLH PTYYRDIRRG KLYAEAKFAD
101 GSVTYGKAXX XXXXQSPKAM DLFTLAWQLA ANDAKLPPGL KITNGKKLYS
151 VGGLNKAGTG KYSIGGVETE VVKYRVRRGD DAVMYFFAPS LNNIPAQIGY
201 TDDGKTYTLK LKSVQINGQA AKP*

ORF120a (SEQ ID NO: 778) and ORF120-1 (SEQ ID NO: 776) show 93.3% identity in 223 aa
overlap:

10 20 30 40 50 60

orf 120a .pep MMKTFKNIFSAAILSAALPCAYAAGLPXSAVLHYSGSYGIPATXXXXXXXNAXKIVSTIK 

I I II I I I I : I I I I I I I I I II Ml I I I I II I I I I I I I I I I I : || IIIIIM 

orf 12 0-1 MMKTFKNI FSAAI LSAALPCAYAAGLPQSAVLHYSGS YGI PATMTFERSGNAYKI VSTI K 

5 10 20 30 40 50 60 

70 80 90 100 110 120 

orf 120a . pep VPLYNIRFESGGTWGNTLHPTYYRDIRRGKLYAEAKFADGSVTYGKAXXXXXXQSPKAM 

II II I MM III I 1 1 1 Mini llllllll Mill Mill III I II : MINI 

orf 120-1 VPLYNIRFESGGTVVGNTLHPTYYRDIRRGKLYAEAKFADGSVTYGKAGESKTEQSPKAM 
10 70 80 90. 100 110 120 

130 140 150 160 170 180 

orf 12 0a. pep DL FTLAWQLAANDAKLPPGLK I TNGKKiYS VGGLNKAGTGKYS I GGVETEVVKYRVRRGD 

I M 1 1 1 1 1 i 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 Ml 1 1 1 II 1 1 1 1 M 1 1 1 1 M 1 1 1 1 1 ■ I I 

orf 12 0-1 DLFTLAWQLAANDAKLP PGLKI TNGKKLYS VGGLNKAGTGKYS I GGVETE WKYRVRRGD 

15 130 140 150 160 170 180 

190 200 210 220 

orf 120a . pep DAVMYFFAPSLNNIPAQIGYTDDGKTYTLKLKSVQINGQAAKPX 

I I I I I I I I I I I I I I I I I I I I I I I I I I I I II I I I I I I I I I I M I I 
orf 120-1 DAVMYFFAPSLNNIPAQIGYTDDGKTYTLKLKSVQINGQAAKPX 

20 190 200 210 220 

Homology with a predicted ORF from N. gonorrhoeae 

ORF120 (SEQ ID NO: 774) shows 97.8% identity over 184 aa overlap with a predicted ORF
(ORF120ng) (SEQ ID NO: 780) from N. gonorrhoeae:

orf 120. pep I PATMTFERSGNAYKI VST I KVPLYN I RFE 30 

25 I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 

orf 120ng SAAI LSAALPCAYAARLPQSAVLHYSGS YGI PATMTFERSGNAYKI VSTI KVPLYN I RFE 69 

orf 120 .pep SGGTWGNTLHPTYYRDIRRGKLYAEAKFADGSVTYGKAGESKTEQSPKAMDLFTLAWQL 90 

II II I II II I I hi hi II hi II I II II II III II II I MM II I II II II I II II Ml I 
orf 12 0ng SGGTWGNTLHPAYYKDIRRGKLYAEAKFADGSVTYGKAGESKTEQSPKAMDLFTLAWQL 129 

30 orf 120. pep AANDAKL P PGLKI TNGKKLYS VGGLNKAGTGKYS I GGVETE WKYRVRRGDDAVMYFFAP 150 

III IMIMMIM llllll MIMMIIMM IMMMIIMIMM Mill 

or f 1 2 Ong AANDAKL P PGLKI TNGKKLYS VGGLNKAGTGKYS I GGVETE WKYRVRRGDDTVTYFFAP 189 

orf 120 .pep SLNNI PAQIGYTDDGKTYTLKLKSVQINGQAAKP 184 

1 1 II 1 1 1 1 1 1 1 III 1 1 M 1 1 1 1 M 1 1 II I I M 

35 orf 120ng SLNNI PAQIGYTDDGKTYTLKLKSVQINGQAAKP 223 

The complete length ORF120ng nucleotide sequence [<SEQ ID 779>] (SEQ ID NO: 779) is:

1 ATGATGAAGA CTTTTAAAAA TATATTTTCC GCCGCCATTT TGTCCGCCGC
51 CCTGCCGTGC GCGTATGCGG CAAGGCTACC CCAATCCGCC GTGCTGCACT
101 ATTCCGGCAG CTACGGCATT CCCGCCACGA TGACATTTGA ACGCAGCGGC
151 AATGCTTACA AAATCGTTTC GACGATTAAA GTGCCGCTAT ACAATATCCG
201 TTTCGAATCC GGCGGTACGG TTGTCGGCAA TACCCTGCAC CCTGCCTACT
251 ATAAAGACAT ACGCAGGGGC AAACTGTATG CGGAAGCCAA ATTCGCCGAC
301 GGCAGCGTAA CCTACGGCAA AGCGGGCGAG AGCAAAACCG AGCAAAGCCC
351 CAAGGCTATG GATTTGTTCA CGCTTGCCTG GCAGTTGGCG GCAAATGACG
401 CGAAACTCCC CCCGGGTCTG AAAATCACCA ACGGCAAAAA ACTTTATTCC
451 GTCGGCGGCC TGAATAAGGC GGGTACGGGA AAATACAGCA TaggCGGCGT
501 GGAAACCGAA GTCGTCAAAT ATCGGGTGCG GCGCGGCGAC GATACGGTAA
551 CGTATTTCTT CGCACCGTCC CTGAACAATA TTCCGGCACA AATCGGCTAT
601 ACCGACGACG GCAAAACCTA TACGCTGAAG CTCAAATCGG TGCAGATCAA
651 CGGACAGGCC GCCAAACCGT AA



This encodes a protein having amino acid sequence [<SEQ ID 780>] (SEQ ID NO: 780):

1 MMKTFKNIFS AAILSAALPC AYAARLPQSA VLHYSGSYGI PATMTFERSG
51 NAYKIVSTIK VPLYNIRFES GGTVVGNTLH PAYYKDIRRG KLYAEAKFAD
101 GSVTYGKAGE SKTEQSPKAM DLFTLAWQLA ANDAKLPPGL KITNGKKLYS
151 VGGLNKAGTG KYSIGGVETE VVKYRVRRGD DTVTYFFAPS LNNIPAQIGY
201 TDDGKTYTLK LKSVQINGQA AKP*
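
The relationship between each nucleotide listing and the protein it "encodes" is a codon-by-codon translation. As a minimal sketch, assuming the standard genetic code and translation in the first reading frame, the Python below translates the opening and closing bases of the ORF120ng listing. The names `CODON_TABLE` and `translate` are illustrative, and mapping any codon containing an ambiguous base to `X` is a simplifying assumption rather than the method used in the specification.

```python
from itertools import product

# Standard genetic code, enumerated in T, C, A, G order for each codon position.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def translate(dna: str) -> str:
    """Translate DNA in frame 1; '*' marks a stop codon, 'X' an unresolved codon."""
    # Listings use spaces between groups and lower case for some bases.
    seq = "".join(dna.split()).upper()
    protein = []
    for i in range(0, len(seq) - 2, 3):
        protein.append(CODON_TABLE.get(seq[i:i + 3], "X"))
    return "".join(protein)


if __name__ == "__main__":
    print(translate("ATGATGAAGA CTTTTAAAAA TATATTTTCC"))  # first 30 bases
    print(translate("GCCAAACCGT AA"))                      # last 12 bases
```

The first call reproduces the opening residues MMKTFKNIFS of the listed protein, and the second reproduces the closing AKP followed by the stop marker.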

In comparison with ORF120-1 (SEQ ID NO: 776), ORF120ng (SEQ ID NO: 780) shows 97.8%
identity in 223 aa overlap:

10 20 30 40 50 60
orf120-1.pep MMKTFKNIFSAAILSAALPCAYAAGLPQSAVLHYSGSYGIPATMTFERSGNAYKIVSTIK
Illlllllllllllllilllllll IIIMMMIIIIIIIIIMIIIIIIIIIIIIIII
orf120ng MMKTFKNIFSAAILSAALPCAYAARLPQSAVLHYSGSYGIPATMTFERSGNAYKIVSTIK
10 20 30 40 50 60

70 80 90 100 110 120
orf120-1.pep VPLYNIRFESGGTVVGNTLHPTYYRDIRRGKLYAEAKFADGSVTYGKAGESKTEQSPKAM
IIIMIIIIIIIIIIMIIIhlhlllMIIIMIIIIIIIMIIIIIIIIIIIIIMI
orf120ng VPLYNIRFESGGTVVGNTLHPAYYKDIRRGKLYAEAKFADGSVTYGKAGESKTEQSPKAM
70 80 90 100 110 120

130 140 150 160 170 180
orf120-1.pep DLFTLAWQLAANDAKLPPGLKITNGKKLYSVGGLNKAGTGKYSIGGVETEVVKYRVRRGD
Mllllillllllllil I MM I ill hIMIIIIIMIIIIIIIIIIIIIII
orf120ng DLFTLAWQLAANDAKLPPGLKITNGKKLYSVGGLNKAGTGKYSIGGVETEVVKYRVRRGD
130 140 150 160 170 180

190 200 210 220
orf120-1.pep DAVMYFFAPSLNNIPAQIGYTDDGKTYTLKLKSVQINGQAAKPX
hi MINIMUM NIU I lllllllllllllllll
orf120ng DTVTYFFAPSLNNIPAQIGYTDDGKTYTLKLKSVQINGQAAKPX
190 200 210 220



This analysis, including the presence of a putative leader sequence in the gonococcal protein,
suggests that the proteins from N. meningitidis and N. gonorrhoeae, and their epitopes, could be
useful antigens for vaccines or diagnostics, or for raising antibodies.
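
The "% identity in N aa overlap" figures quoted in this example are the fraction of aligned positions at which the two sequences carry the same residue. As a minimal sketch, assuming two already-aligned strings of equal length with `-` marking gaps, the Python below computes that fraction for the final 43 residues of ORF120-1 and ORF120ng. The function name `percent_identity` is illustrative, and excluding gapped columns from the denominator is an assumption about the convention, not something the specification states.

```python
def percent_identity(a: str, b: str) -> float:
    """Percent of ungapped aligned columns in which both residues are identical."""
    if len(a) != len(b):
        raise ValueError("aligned sequences must have equal length")
    # Keep only columns where neither sequence has a gap character.
    pairs = [(x, y) for x, y in zip(a, b) if x != "-" and y != "-"]
    if not pairs:
        raise ValueError("no ungapped columns to compare")
    matches = sum(x == y for x, y in pairs)
    return 100.0 * matches / len(pairs)


if __name__ == "__main__":
    orf120_1 = "DAVMYFFAPSLNNIPAQIGYTDDGKTYTLKLKSVQINGQAAKP"
    orf120ng = "DTVTYFFAPSLNNIPAQIGYTDDGKTYTLKLKSVQINGQAAKP"
    print(round(percent_identity(orf120_1, orf120ng), 1))  # 41 of 43 columns match
```

Over the full 223-residue proteins listed above, the two sequences differ at five positions, and 218/223 rounds to the 97.8% reported in the text.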



Example 93 



The following partial DNA sequence was identified in N. meningitidis [<SEQ ID 781>] (SEQ ID
NO: 781):

1 ATGTATCGGA GGAAAGGGCG GGGCATCAAG CCGTGGATGG GTGCCGGTGC
51 .GCGTTTGCC GCCTTGGTCT GGCTGGTTTT CGCGCTCGGC GATACTTTGA
101 CTCCGTTTGC GGTTGCGGCG GTGCTGGCGT ATGTATTGGA CCCTTTGGTC
151 GAATGGTTGC AGAAAAAGGG TTTGAACCGT GCATCCGCTT CGATGTCTGT
201 GATGGTGTTT TCCTTGATTT TGTTGTTGGC ATTATTGTTG ATTATCGTCC
251 CTATGCTGGT CGGGCAGTTC AACAATTTGG CATCGCGCCT GCCCCAATTA
301 ATCGGTTTTA TGCAGAACAC GCTGCTGCCG TGGTTGAAAA ATACAATCGG
351 CGGATATGTG GAAATCGATC AGGCATCTAT TATTGCGTGG CTTCAGGCGC
401 ATACGGGAGA GTTGAGCAAC GCGCTTAAGG CGTGGTTTCC CGTTTTGATG
451 AGGCAGGGCG GCAATATT..



This corresponds to the amino acid sequence [<SEQ ID 782; ORF121>] (SEQ ID NO: 782;
ORF121):



1 MYRRKGRGIK PWMGAGXAFA ALVWLVFALG DTLTPFAVAA VLAYVLDPLV 
51 EWLQKKGLNR ASASMSVMVF SLILLLALLL IIVPMLVGQF NNLASRLPQL 
101 IGFMQNTLLP WLKNTIGGYV EIDQASIIAW LQAHTGELSN ALKAWFPVLM 
151 RQGGNI . . 

Further work revealed the complete nucleotide sequence [<SEQ ID 783>] (SEQ ID NO: 783):



1 ATGTATCGGA GGAAAGGGCG GGGCATCAAG CCGTGGATGG GTGCCGGTGC 

51 GGCGTTTGCC GCCTTGGTCT GGCTGGTTTT CGCGCTCGGC GATACTTTGA 

101 CTCCGTTTGC GGTTGCGGCG GTGCTGGCGT ATGTATTGGA CCCTTTGGTC 

151 GAATGGTTGC AGAAAAAGGG TTTGAACCGT GCATCCGCTT CGATGTCTGT 

201 GATGGTGTTT TCCTTGATTT TGTTGTTGGC ATTATTGTTG ATTATCGTCC 

251 CTATGCTGGT CGGGCAGTTC AACAATTTGG CATCGCGCCT GCCCCAATTA

301 ATCGGTTTTA TGCAGAACAC GCTGCTGCCG TGGTTGAAAA ATACAATCGG 

351 CGGATATGTG GAAATCGATC AGGCATCTAT TATTGCGTGG CTTCAGGCGC 

401 ATACGGGAGA GTTGAGCAAC GCGCTTAAGG CGTGGTTTCC CGTTTTGATG

451 AGGCAGGGCG GCAATATTGT CAGCAGTATC GGCAACCTGC TGCTGCTTCC

501 CTTGCTGCTT TACTATTTCC TGCTGGATTG GCAGCGGTGG TCGTGCGGCA 

551 TTGCCAAACT GGTTCCGAgG CGTTTTGCCG GTGCTTATAC GCGCATTACA 

601 GGCAATTTGA ACGAGGTATT GGGCGAATTT TTGCGCGGGC AGCTTCTGGT 

651 AATGCTGATT ATGGGCTTGG TTTACGGTTT GGGATTGGTG CTGGTCGGGC 

701 TGGATTCGGG GTTTGCCATC GGTATGCTTG CCGGTATTTT GGTGTTTGTC 

751 CCTTATCTCG GGGCGTTTAC GGGATTGCTG CTTGCCACCG TCGCCGCCTT 

801 GCTCCAGTTC GGTTCGTGGA ACGGCATCCT ATCGGTTTGG GCGGTTTTTG 

851 CCGTAGGACA GTTTCTCGAA AGTTTTTTCA TTACGCCGAA AATCGTGGGA

901 GACCGTATCG GGCTGTCGCC GTTTTGGGTT ATCTTTTCGC TGATGGCGTT

951 CGGGCAGCTG ATGGGCTTTG TCGGAATGTT GGCGGGATTG CCTTTGGCCG 

1001 CCGTAACCTT GGTCTTGCTT CGCGAGGGCG TGCAGAAATA TTTTGCCGGC 

1051 AGTTTTTACC GGGGCAGGTA G 

This corresponds to the amino acid sequence [<SEQ ID 784; ORF121-1>] (SEQ ID NO: 784;
ORF121-1):

1 MYRRKGRGIK PWMGAGAAFA ALVWLVFALG DTLTPFAVAA VLAYVLDPLV
51 EWLQKKGLNR ASASMSVMVF SLILLLALLL IIVPMLVGQF NNLASRLPQL
101 IGFMQNTLLP WLKNTIGGYV EIDQASIIAW LQAHTGELSN ALKAWFPVLM
151 RQGGNIVSSI GNLLLLPLLL YYFLLDWQRW SCGIAKLVPR RFAGAYTRIT
201 GNLNEVLGEF LRGQLLVMLI MGLVYGLGLV LVGLDSGFAI GMLAGILVFV
251 PYLGAFTGLL LATVAALLQF GSWNGILSVW AVFAVGQFLE SFFITPKIVG
301 DRIGLSPFWV IFSLMAFGQL MGFVGMLAGL PLAAVTLVLL REGVQKYFAG
351 SFYRGR*

Computer analysis of this amino acid sequence gave the following results:

Homology with a predicted ORF from N. meningitidis (strain A)

ORF121 (SEQ ID NO: 782) shows 98.7% identity over a 156 aa overlap with an ORF (ORF121a)
(SEQ ID NO: 786) from strain A of N. meningitidis:

5 10 20 30 40 50 60 

orf 12 1 . pep MYRRKGRG I KPWMGAGXAFAALWLVFALGDTLTP FAVAAVLAYVLDPLVEWLQKKGLNR 

MMII IIMM II 1 1 1 1 1 1 1 1 i 1 1 i 1 1 1 1 1 1 1 1 1 1 1 1 ! 1 1 1 1 1 1 1 

or f 1 2 la MYRRKGRG I KPWMDAGAAFAALVWLVFALGDTLTP FAVAAVLAYVLDPLVEWLQKKGLNR 

10 20 30 40 50 60 

10 70 80 90 100 110 120 

orf 121 .pep ASASMSVMVFSLILLLALLLIIVPMLVGQFNNLASRLPQLIGFMQNTLLPWLKNTIGGYV 

MIMIIIIIIII IMMIMIIMIIIMIIIMIII llllll lllllllllllll 

orf 121a AS ASMS VMVFS L I LLLALLL 1 1 VPMLVGQFNNLASRLPQL I GFMQNTLLPWLKNT I GGYV 

70 80 90 100 110 120 

15 130 140 150 

orf 121 . pep EIDQASI IAWLQAHTGELSNALKAWFPVLMRQGGNI 

I I I I I I I I I II I I I I I I I I I I I I I I I I II Ml I 
orf 121a EIDQASI IAWLQAHTGELSNALKAWFPVLMRQGGNIVSSIGNLLLLPLLLYYFLLDWQRW 

130 140 150 160 170 180 

20 orf 121a SCGIAKLVPRRFAGAYTRITGNLNEVLGEFLRGQLLVMLIMGLVYGLGLVLVGLDSGFAI 

190 200 210 220 230 240 

The complete length ORF121a nucleotide sequence [<SEQ ID 785>] (SEQ ID NO: 785) is:

1 ATGTATCGGA GGAAAGGGCG GGGCATCAAG CCGTGGATGG ATGCCGGTGC 

51 GGCGTTTGCC GCCTTGGTCT GGCTGGTTTT CGCGCTCGGC GATACTTTGA

101 CTCCGTTTGC GGTTGCGGCG GTGCTGGCGT ATGTATTGGA CCCTTTGGTC 

151 GAATGGTTGC AGAAAAAGGG TTTGAACCGT GCATCCGCTT CGATGTCTGT 

201 GATGGTGTTT TCCTTGATTT TGTTGTTGGC ATTATTGTTG ATTATTGTCC 

251 CTATGCTGGT CGGGCAGTTC AACAATTTGG CATCGCGCCT GCCCCAATTA 

301 ATCGGTTTTA TGCAGAACAC GCTGCTGCCG TGGTTGAAAA ATACAATCGG

351 CGGATATGTG GAAATCGATC AGGCATCTAT TATTGCGTGG CTTCAGGCGC 

401 ATACGGGCGA GTTGAGCAAC GCGCTTAAGG CGTGGTTTCC CGTTTTGATG 

451 AGGCAGGGCG GCAATATTGT CAGCAGTATC GGCAACCTGC TGCTGCTTCC

501 CTTGCTGCTT TACTATTTCC TGCTGGATTG GCAGCGGTGG TCGTGCGGCA

551 TTGCCAAACT GGTTCCGAGG CGTTTTGCCG GTGCTTATAC GCGCATTACA

601 GGCAATTTGA ACGAGGTATT GGGCGAATTT TTGCGCGGGC AGCTTCTGGT 

651 GATGCTGATT ATGGGTTTGG TTTACGGCTT GGGGTTGGTG CTGGTCGGGC 

701 TGGATTCGGG GTTTGCAATC GGTATGGTTG CCGGTATTTT GGTTTTTGTT 

751 CCCTATTTGG GCGCGTTTAC AGGACTGCTG CTGGCAACCG TCGCCGCCTT 

801 GCTCCAGTTC GGTTCGTGGA ACGGCATCTT GGCTGTTTGG GCGGTTTTTG

851 CCGTAGGACA GTTTCTCGAA AGTTTTTTCA TTACGCCGAA AATCGTGGGA 

901 GACCGTATCG GCCTGTCGCC GTTTTGGGTT ATCTTTTCGC TGATGGCGTT 

951 CGGGCAGCTG ATGGGCTTTG TCGGAATGTT GGCCGGATTG CCTTTGGCCG 

1001 CCGTAACCTT GGTCTTGCTT CGCGAGGGCG TGCAGAAATA TTTTGCCGGC 

1051 AGTTTTTACC GGGGCAGGTA G



This encodes a protein having amino acid sequence [<SEQ ID 786>] (SEQ ID NO: 786):

1 MYRRKGRGIK PWMDAGAAFA ALVWLVFALG DTLTPFAVAA VLAYVLDPLV
51 EWLQKKGLNR ASASMSVMVF SLILLLALLL IIVPMLVGQF NNLASRLPQL
101 IGFMQNTLLP WLKNTIGGYV EIDQASIIAW LQAHTGELSN ALKAWFPVLM
151 RQGGNIVSSI GNLLLLPLLL YYFLLDWQRW SCGIAKLVPR RFAGAYTRIT
201 GNLNEVLGEF LRGQLLVMLI MGLVYGLGLV LVGLDSGFAI GMVAGILVFV
251 PYLGAFTGLL LATVAALLQF GSWNGILAVW AVFAVGQFLE SFFITPKIVG
301 DRIGLSPFWV IFSLMAFGQL MGFVGMLAGL PLAAVTLVLL REGVQKYFAG
351 SFYRGR*

ORF121a (SEQ ID NO: 786) and ORF121-1 (SEQ ID NO: 784) show 99.2% identity in 356 aa
overlap:



10 20 30 40 50 60 

or f 12 la . pep MYRRKGRG I KPWMDAGAAFAALVWLVFALGDTLTPFAVAAVLAYVLDPLVEWLQKKGLNR 

1 1 1 1 ' 1 1 1 1 1 j 1 1 l-IIIMI IIIIIIMIIMI llllll'IIIMMIMIIIM 

15 or f 1 2 1 - 1 MYRRKGRG I KPWMGAGAAFAALVWLVFALGDTLTPFAVAAVLAYVLDPLVEWLQKKGLNR 

10 20 30 40 50 60 



70 80 90 100 110 120 

orf 12 la. pep ASASMSVMVFSLILLLALLLIIVPMLVGQFNNLASRLPQLIGFMQNTLLPWLKNTIGGYV 
I I I I I I I M I I I I I I ! I I I I I I 1 I I I I I I ! I II I I I ! I I I I I I I I I I I I M I I I I M I 
20 orf 12 1-1 ASASMSVMVFSLILLLALLLI I VPMLVGQFNNLASRLPQLIGFMQNTLLPWLKNTIGGYV 

70 80 90 100 110 120 

130 140 150 160 170 180 

orf 121a. pep EIDQASI IAWLQAHTGELSNALKAWFPVLMRQGGNIVSS IGNLLLLPLLLYYFLLDWQRW 

1 1 II I M 1 1 1 1 1 1 M I 1 1 1 1 1 U 1 1 1 1 1 1 1 1 1 1 II ; I II 1 1 U 1 1 1 1 1 1 1 1 1 1 1 1 II 

25 orf 121-1 EIDQASI I AWLQAHTGELSNALKAWFPVLMRQGGN I VSS IGNLLLLPLLLYYFLLDWQRW 

130 140 150 160 170 180 



190 200 210 220 230 240 

orf 12 la. pep SCGIAKLVPRRFAGAYTRITGNLNEVLGEFLRGQLLVMLIMGLVYGLGLVLVGLDSGFAI 
II I I I I I I I I I I I I I I II I I I I I I I I I I I I I I I I I I I II I I I I I I I I I I I I I I II I I 
30 or f 12 1 - 1 SCGIAKLVPRRFAGAYTRITGNLNEVLGEFLRGQLLVMLIMGLVYGLGLVLVGLDSGFAI 

190 200 210 220 230 240 



250 260 270 280 290 300 

orf 12 la . pep GMVAGILVFVPYLGAFTGLLLATVAALLQFGSWNGILAVWAVFAVGQFLESFFITPKIVG 
I U I I I I I ' I I I I I I I I I I I I I I I I I IM I I II I I I M I I I I I I I II I I I I I I I I I I 
35 orf 121-1 GMLAGILVFVPYLGAFTGLLLATVAALLQFGSWNGILSVWAVFAVGQFLESFFITPKIVG 

250 260 270 280 290 300 



310 320 330 340 350 

orf 12 la. pep DRIGLSPFWVIFSLMAFGQLMGFVGMLAGLPLAAVTLVLLREGVQKYFAGSFYRGRX 

I ! 1 1 1 1 1 M 1 1 1 1 II 1 1 1 1 1 1 1 1 1 1 1 1 M 1 1 1 1 Ml 1 1 1 1 1 1 1 1 1 1 1 M 1 1 M I 

40 orf 121-1 DRIGLSPFWVIFSLMAFGQLMGFVGMLAGLPLAAVTLVLLREGVQKYFAGSFYRGRX 

310 320 330 340 350 



Homology with a predicted ORF from N. gonorrhoeae 



ORF121 (SEQ ID NO: 782) shows 97.4% identity over a 156 aa overlap with a predicted ORF
(ORF121ng) (SEQ ID NO: 788) from N. gonorrhoeae:

orf121.pep MYRRKGRGIKPWMGAGXAFAALVWLVFALGDTLTPFAVAAVLAYVLDPLVEWLQKKGLNR 60

IIIIIIIIIIIMIII MMIMIMIIIMMII IIIMM IIMMIMI MINI 

or f 1 2 1 ng M YRRKGRG I KPWMGAGAAFAALVWLVYALGDTLT P FAVAAVLAYVLDPLVEWLQKKGLNR 6 0 

i 

orf 121. pep ASASMSVMVFSLILLLALLLIIVPMLVGQFNNLASRLPQLIGFMQNTLLPWLKNTIGGYV 120 

I I I I ! I I MTI : I II I I I I I I I I I I I I II I I , I I I i I I II I I I I I I M i I I I I I I M 

orf 121ng ASASMSVMVFSLILLLALLLIIVPMLVGQFNNLASRLPQLIGFMQNTLLPWLKNTIGGYV 120 

orf 121 .pep EIDQAS I IAWLQAHTGELSNALKAWFPVLMRQGGNI 156 

I I I I I I I h i I I I I I I I I I I I I I I I I i hi I I I 

orf 121ng EIDQASIIAWFQAHTGELSNALKAWFPVLMKQGGNIVSTIGNLLLPPLLLYYFLLDWHRW 180 

An ORF121ng nucleotide sequence [<SEQ ID 787>] (SEQ ID NO: 787) was predicted to encode a
protein having amino acid sequence [<SEQ ID 788>] (SEQ ID NO: 788):



1 MYRRKGRGIK PWMGAGAAFA ALVWLVYALG DTLTPFAVAA VLAYVLDPLV
51 EWLQKKGLNR ASASMSVMVF SLILLLALLL IIVPMLVGQF NNLASRLPQL
101 IGFMQNTLLP WLKNTIGGYV EIDQASIIAW FQAHTGELSN ALKAWFPVLM
151 KQGGNIVSTI GNLLLPPLLL YYFLLDWHRW SCGIPKLVPR RFAGAYTRIT
201 GNLNKVWGKF LRGQLLGETE RGAWCRVGR ECWEGGGARS RPSDDGWPRW
251 GGG*

Further work revealed the following gonococcal DNA sequence [<SEQ ID 789>] (SEQ ID NO:
789):



1 ATGTATCGGA GAAAAGGACG GGGCATCAAG CCGTGGATGG GTGCCGGCGC 

51 GGCGTTTGCC GCCTTGGTCT GGCTGGTTTA CGCGCTCGGC GATACTTTGA 

101 CTCCGTTTGC GGTTGCGGCG GTGCTGGCGT ATGTGTTGGA CCCTTTGGTC 

151 GAATGGTTGC AGAAAAAGGG TTTGAACCGT GCATCCGCTT CGATGTCTGT 

201 GATGGTGTTT TCCTTGATTT TGTTGTTGGC ATTATTGTTG ATTATTGTCC

251 CTATGCTGGT CGGGCAGTTC AATAATTTGG CATCTCGCCT GCCCCAATTA
301 ATCGGTTTTA TGCAGAACAC GCTGCTGCCG TGGTTGAAAA ATACAATCGG

351 CGGATATGTG GAAATCGATC AGGCATCTAT TATTGCGTGG TTTCAGGCGC
401 ATACGGGCGA GTTGAGCAAC GCGCTTAAGG CGTGGTTTCC CGTTTTGATG 
451 AAACAGGGCG GCAATATTGT CAGCAGTATC GGCAACCTGC TGCTGCCGCC 
501 CTTGCTGCTT TACTATTTCC TGCTGGATTG GCAGCGGTGG TCGTGCGGCA 
551 TCGCCAAACT GGTTCCGAGG CGTTTTGCCG GTGCTTATAC GCGCATTACG 
601 GGTAATTTGA ACGAGGTATT GGGCGAATTT TTGCGCGGTC AGCTTCTGGT 
651 GATGCTGATT ATGGGCTTGG TTTACGGTTT GGGATTGATG CTAGTCGGAC 
701 TGGATTCGGG ATTTGCCATC GGTATGGTTG CCGGTATTTT GGTGTTTGTC 
751 CCCTATTTGG GTGCGTTTAC GGGATTGCTG CTTGCCACTG TTGCAGCCTT 
801 GCTCCAGTTC GGTTCGTGGA ACGGAATCTT GGCTGTTTGG GCGGTTTTTG 
851 CCGTCGGTCA GTTTCTCGAA AGTTTTTTCA TTACGCCGAA AATTGTAGGA 
901 GACCGTATCG GCCTGTCGCC GTTTTGGGTT ATCTTTTCGC TGATGGCGTT 
951 CGGAGAGCTG ATGGGCTTTG TCGGAATGTT GGCCGGATTG CCTTTGGCCG 

1001 CCGTAACCTT GGTCTTGCTT CGCGAGGGCG CGCAGAAATA TTTTGCCGGC 

1051 AGTTTTTACC GGGGCAGGTA G 

This corresponds to the amino acid sequence [<SEQ ID 790; ORF121ng-1>] (SEQ ID NO: 790;
ORF121ng-1):

1 MYRRKGRGIK PWMGAGAAFA ALVWLVYALG DTLTPFAVAA VLAYVLDPLV
51 EWLQKKGLNR ASASMSVMVF SLILLLALLL IIVPMLVGQF NNLASRLPQL
101 IGFMQNTLLP WLKNTIGGYV EIDQASIIAW FQAHTGELSN ALKAWFPVLM
151 KQGGNIVSSI GNLLLPPLLL YYFLLDWQRW SCGIAKLVPR RFAGAYTRIT
201 GNLNEVLGEF LRGQLLVMLI MGLVYGLGLM LVGLDSGFAI GMVAGILVFV
251 PYLGAFTGLL LATVAALLQF GSWNGILAVW AVFAVGQFLE SFFITPKIVG
301 DRIGLSPFWV IFSLMAFGEL MGFVGMLAGL PLAAVTLVLL REGAQKYFAG
351 SFYRGR*

ORF121ng-1 (SEQ ID NO: 790) and ORF121-1 (SEQ ID NO: 784) show 97.5% identity in 356 aa
overlap:



10 20 30 40 50 60 

10 orf 121-1 .pep MYRRKGRGIKPWMGAGAAFAALWLVFALGDTLTPFAVAAVLAYVLDPLVEWLQKKGLNR 

i 1 1 1 1 1 1 1 1 1 1 II I II 1 1 1 1 M I MM 1 1 1 1 1 1 1 1 M M 1 1 1 1 1 1 1 1 1 II I II 1 1 1 1 1 

orfl21ng-l " M YRRKGRG I KPWMGAGAAFAAL VWLVYALGDTLTP FAVAAVLAYVLD PLVEWLQ KKGLNR 

10 20 30 40 50 60 



70 80 90 100 110 120 

15 orf 121-1. pep ASASMSVIWFSLILLT^LLLIIVPMLVGQFNNLASRLPQLIGFMQNTLLPWLKNTIGGYV 

1 I I I I I I I i I I I T I I I I I I I I I I I I I I > I I I I I Mill I I I I I I I I I I I I 

orf 121ng-l ASASMSVMVFSLILLLALLLIIVPMLVGQFNNLASRLPQLIGFMQNTLLPWLKNTIGGYV 

70 80 90 100 110 120 



130 140 150 160 170 180 

20 orf 121-1. pep EIDQAS I I AWLQAHTGELSNALKAWFPVLMRQGGNI VSS IGNLLLLPLLLYYFLLDWQRW 

Ml M 1 1 1 1 M II II I II 1 1 1 1 M 1 1 1 M I II I Ml 1 1 1 1 M I 1 1 1 1 1 1 1 1 1 1 1 1 1 1 

orf 121ng-l EIDQAS I I AWFQAHTGELSNALKAWFPVLMKQGGN I VSS I GNLLLPPLLL YYFLLDWQRW 

130 140 150 160 170 180 



190 200 210 220 230 240 

25 orf 12 1 - 1 . pep SCGIAKLVPRRFAGAYTRITGNLNEVLGEFLRGQLLVMLIMGLVYGLGLVLVGLDSGFAI 

III 1 1 1 1 1 1 1 1 M II 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 M I II I M II II 1 1 1 1 Ml 1 1 1 III I 

orf 121ng-l SCGIAKLVPRRFAGAYTRITGNLNEVLGEFLRGQLLVMLIMGLVYGLGLMLVGLDSGFAI 

190 200 210 220 230 240 



250 260 270 280 290 300 

30 orf 12 1 - 1 . pep GMLAGILVFVPYLGAFTGLLLATVAALLQFGSWNGILSVWAVFAVGQFLESFFITPKI VG 

Ml 1 1 1 1 M 1 1 M 1 1 II 1 1 1 1 1 Ml 1 1 1 II I M II MM M I II 1 1 1 1 II 1 1 1 1 II 1 1 

orf 121ng-l GMVAGILVFVPYLGAFTGLLLATVAALLQFGSWNGILAVWAVFAVGQFLESFFITPKIVG 

250 260 270 280 290 300 



310 320 330 340 350 

35 or f 12 1 - 1 . pep DRIGLS P FWV I FS LMAFGQLMGFVGMLAGLPLAAVTLVLLREGVQKYFAGS FYRGRX 

I M 1 1 1 1 1 M M M 1 1 1 1 M 1 1 1 M 1 1 1 1 1 1 1 1 1 1 Ml i I II Ml 1 1 1 M II 1 1 1 II 

orf 121ng-l DRIGLSPFWVIFSLMAFGELMGFVGMLAGLPLAAVTLVLLREGAQKYFAGSFYRGRX 

310 320 330 340 350 



In addition, ORF121ng-1 (SEQ ID NO: 790) shows homology to a permease (SEQ ID NO: 1162)
from H. influenzae:

sp|P43969|PERM_HAEIN PUTATIVE PERMEASE PERM HOMOLOG Length = 349
Score = 69.9 bits (168), Expect = 2e-11

Identities = 67/317 (21%), Positives = 120/317 (37%), Gaps = 7/317 (2%)



Query: 26 VYALGDTLTPFAVAAVLAYVLDPLVEWL-QKKGLNRASASMSVMVFSXXXXXXXXXXXVP 84
+Y GD + P +A VL+Y+L+ + +L Q R A++ + VP
Sbjct: 32 IYFFGDLIAPLLIALVLSYLLEIPINFLNQYLKCPRMLATILIFGSFIGLAAVFFLVLVP 91

Query: 85 MLVGQFNNLASRLPQLIGFMQNTLLPWLKNTIGGYVE-IDQASIIAWFQAHTGELSNALK 143
ML Q +L S LP + N WL N YEID+++F+ ++ +
Sbjct: 92 MLWNQTISLLSDLPAMF NKSNEWLLNLPKNYPELIDYSMVDSIFNSVREKILGFGE 147

Query: 144 203
+ + + N+VS D G+++ +P+ A+ R +
Sbjct: 148 SAVKLSLASIMNLVSLGIYAFLVPLMMFFMLKDKSELLQGVSRFLPKNRNLAFXRWK-EM 206

Query: 204 NEVLGEFLRGQXXXXXXXXXXXXXXXXXXXXDSGFAIGMVAGILVFVPYXXXXXXXXXXX 263
+ + + + Q+ + + G+ V VPY
Sbjct: 207 QQQISNYIHGKLLEILIVTLITYIIFLIFGLNYPLLLAFAVGLSVLVPYIGAVIVTIPVA 266

Query: 264 XXXXXQFGSWNGILAVWAVFAVGQFLESFFITPKIVGDRIGLSPFWVIFSLMAFGELMGF 323
QFG + FAV QL+ +P+ ++LP +1 S++ FG L GF
Sbjct: 267 LVALFQFGISPTFWYIIIAFAVSQLLDGNLLVPYLFSEAVNLHPLIIIISVLIFGGLWGF 326

Query: 324 VGMLAGLPLAAVTLVLL 340
G+ +PLA + ++
Sbjct: 327 WGVFFAIPLATLVKAVI 343





Based on this analysis, including the presence of a putative leader sequence and transmembrane
domains in the two proteins, it is predicted that the proteins from N. meningitidis and
N. gonorrhoeae, and their epitopes, could be useful antigens for vaccines or diagnostics, or for
raising antibodies.
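
The BLAST summary line reported above for the H. influenzae permease hit (Identities, Positives, Gaps) can be read mechanically. As a minimal sketch, the Python below extracts the three fractions with a regular expression and recomputes each percentage. The names `SUMMARY` and `parse_summary` are illustrative assumptions. The percentages printed on that line agree with integer truncation of each fraction rather than rounding; this is an observation about that one line, not a statement about how the original search program was implemented.

```python
import re

# Matches e.g. "Identities = 67/317 (21%)"; tolerant of spacing around "=".
SUMMARY = re.compile(r"(Identities|Positives|Gaps)\s*=\s*(\d+)/(\d+)\s*\((\d+)%\)")


def parse_summary(line: str) -> dict:
    """Return {field: (count, alignment_length, printed_percent)} for one summary line."""
    return {
        name: (int(count), int(length), int(percent))
        for name, count, length, percent in SUMMARY.findall(line)
    }


if __name__ == "__main__":
    line = "Identities = 67/317 (21%), Positives = 120/317 (37%), Gaps = 7/317 (2%)"
    for name, (count, length, printed) in parse_summary(line).items():
        # Integer division truncates, which reproduces the printed values here.
        print(name, count, length, printed, 100 * count // length)
```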



Example 94 

The following partial DNA sequence was identified in N. meningitidis [<SEQ ID 791>] (SEQ ID
NO: 791):

1 . . ACTGCTTTTT CGGCGGCGCT GCGCTTGAGT CCATCATGAC TCGTCATATT 

51 TTTGTCCTTT GGGAAACCGT ATCAACAAAC AGCCGCCATC TTAACATTTT 

101 TTTGCACGTC CTGCCCGCCG CGTTCAAATG CGTACCAGCA ATACCGCCGC 

151 CTGCGCCTCT ATGCCTTCCA TCCGCCCGAG ATAGCCGAGT TTTTCGTTGG 

201 TTTTGCCTTT GATGTTGACG CACGAAATGT CTATGCCCAA ATCGGCGGCG 

251 ATGTTGGCAC GCATTTGCGG AATGTGCGGC GCGAGTGTGG GTTTCTGTGC 

301 AATCACGGTC GTATCGACAT TGACCGCCTG CCAACCCTGC GCCTGAACGC 

351 TTTGATACGC CGCACGCAAA AGGACGCGGC TGTCCGCATC TTTGAACTCT

401 GCGGCGGTGT CGGGGAAATG GCTGCCGATA TCGCCCAAAC CTGCCGCACC
451 GAGCAGCGCG TCGGTAACGG CGTGCAGCAG CGCATCGGCA TCGGAGTGTC
501 CGAGCAGCCC TTTTTCAAAT GGGATTTCAA CTCCGCCAAG TATCAG. . 

This corresponds to the amino acid sequence [<SEQ ID 792; ORF122>] (SEQ ID NO: 792; 
ORF122): 

1 ..TAFSAALRLS PSXLVIFLSF GKPYQQTAAI LTFFCTSCPP RSNAYQQYRR 

51 LRLYAFHPPE IAEFFVGFAF DVDARNVYAQ IGGDVGTHLR NVRRECGFLC 

101 NHGRIDIDRL PTLRLNALIR RTQKDAAVRI FELCGGVGEM AADIAQTCRT 

151 EQRVGNGVQQ RIGIGVSEQP FFKWDFNSAK YQ.. 

Further work revealed the complete nucleotide sequence [<SEQ ID 793>] (SEQ ID NO: 793): 



1 ATATCGTACT GGGCAAGCAG TTCGCCGGAT TTTTTGGAAG TAGATACCGC 

51 GCCTTTGATT TTTTTGCCGC TCTTACCCAA GGCTTCGATG AAAAAGTTGA 

101 TGGTCGAGCC GGTACCGATG CCGATATATT CATTTTCGGG TACGAATTCG 

151 ACTGCTTTTT CGGCGGCGAT GCGCTTGAGT TCGTCTTGTG TCGTCATATT 

201 TTTGTCCTTT GGGAAACCGT ATCAACAAAC AGCCGCCATC TTAACATTTT 

251 TTTGCACGTC CTGCCCGCCG CGTTCAAATG CGTACCAGCA ATACCGCCGC 

301 CTGCGCCTCT ATGCCTTCCA TCCGCCCGAG ATAGCCGAGT TTTTCGTTGG 

351 TTTTGCCTTT GATGTTGACG CACGAAATGT CTATGCCCAA ATCGGCGGCG 

401 ATGTTGGCAC GCATTTGCGG AATGTGCGGC GCGAGTTTGG GTTTCTGTGC 

451 AATCACGGTC GTATCGACAT TGACCGCCTG CCAACCCTGC GCCTGAACGC 

501 TTTGATACGC CGCACGCAAA AGGACGCGGC TGTCCGCATC TTTGAACTCT 

551 GCGGCGGTGT CGGGGAAATG GCTGCCGATA TCGCCCAAAC CTGCCGCACC 

601 GAGCAGCGCG TCGGTAACGG CGTGCAGCAG CGCATCGGCA TCGGAGTGTC 

651 CGAGCAGCCC TTTTTCAAAT GGGATTTCAA CTCCGCCAAG TATCAGCTTT 

701 CTGCCTTCGG TCAGTTGGTG GACATCGTAG CCCTGTCCGA TACGGATGTT 

751 CGTCATCGTT TGTGTTCCTG A 



This corresponds to the amino acid sequence [<SEQ ID 794; ORF122-1>] (SEQ ID NO: 794; 
ORF122-1): 



1 ISYWASSSPD FLEVDTAPLI FLPLLPKASM KKLMVEPVPM PIYSFSGTNS 

51 TAFSAAMRLS SSCVVIFLSF GKPYQQTAAI LTFFCTSCPP RSNAYQQYRR 

101 LRLYAFHPPE IAEFFVGFAF DVDARNVYAQ IGGDVGTHLR NVRREFGFLC 

151 NHGRIDIDRL PTLRLNALIR RTQKDAAVRI FELCGGVGEM AADIAQTCRT 

201 EQRVGNGVQQ RIGIGVSEQP FFKWDFNSAK YQLSAFGQLV DIVALSDTDV 

251 RHRLCS * 
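
The step from a nucleotide sequence to its amino acid sequence, as in the SEQ ID NO: 793 / SEQ ID NO: 794 pair above, can be sketched as a codon-by-codon translation. The following is only a minimal illustration, assuming the standard genetic code and reading frame 1. The function name `translate` is hypothetical, and the specification does not state which software performed the original translation.

```python
from itertools import product

# Standard genetic code, codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDDDEEEEGGGG"
CODON_TABLE = {"".join(c): a for c, a in zip(product(BASES, repeat=3), AMINO)}


def translate(dna):
    """Translate a DNA string in frame 1 (hypothetical helper).

    Whitespace is ignored, '*' marks a stop codon, and any codon that
    contains an ambiguous base such as N is reported as 'X'.
    """
    dna = "".join(dna.split()).upper()
    protein = []
    for i in range(0, len(dna) - 2, 3):
        protein.append(CODON_TABLE.get(dna[i:i + 3], "X"))
    return "".join(protein)


# First 30 nucleotides of the complete sequence given above.
print(translate("ATATCGTACT GGGCAAGCAG TTCGCCGGAT"))  # ISYWASSSPD
```

The printed residues correspond to the first block of the amino acid listing above; the final 21 nucleotides of the same sequence translate to RHRLCS followed by a stop codon.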

Computer analysis of this amino acid sequence gave the following results: 



Homology with a predicted ORF from N. meningitidis (strain A) 

ORF122 (SEQ ID NO: 792) shows 94.0% identity over a 182 aa overlap with an ORF (ORF122a) 
(SEQ ID NO: 796) from strain A of N. meningitidis: 



10 20 30 

orf 122 .pep TAFSAALRLSPSXLV I FLS FGKP YQQTAAI 

Illllhlll I H I M II I I M I I I i I 
orf 122a FLPLLPKASMKKLMVEPVPMPMYS FSGTNSTAFSAAMRLSS S CWI FLS FGKP YQQTAAI 

30 40 50 60 70 80 



40 50 60 70 80 90 

orf 122 . pep LT FFCTSCPPRSNAYQQYRRLRLYAFHPPE I AEFFVGFAFDVDARNVYAQ IGGDVGTHLR 

1 1 1 1 MINIM MMMM Ml MMIIIMI 1 1 II 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 

orf 122a LTFFXTSCPPRSNPYQQYRRLRLYAFHAPE I TEFFVGFAFXVDARNVYAQ IGGDVGTHLR 

90 100 110 120 130 140 



100 110 120 130 140 150 

orf 122 . pep NVRRECGFLCNHGR ID I DRLPTLRLNAL I RRTQKDAAVR I FELCGGVGEMAAD I AQTCRT 

hill M M I M I M M M M M M M M II M M I M 1 1 1 II I M M M I M M M 1 1 

orf 122a NMRRE FGFLCNHGR ID IDRLPTLRLNAL I RRTQKDAAVR I FELCGGVGEMAAD I AQTCRT 

150 160 170 180 190 200 

160 170 180 

orf 122 . pep EQRVGNGVQQRIGIGVSEQPFFKWDFNSAKYQ 

I M I I I I I I M I : I ! I I I I I I I I I I I I I 
orf 122a EQRVGNGVQQR I G I GVSEQP FFKWDFNS AKYQLS AFGQLVD I VALSDTDVRHRLCSX 

210 220 230 240 250 

The complete length ORF122a nucleotide sequence [<SEQ ID 795>] (SEQ ID NO: 795) is: 



1 ATATCATATT GGGCAAGCAG TTCACTGGAT TTTTTGGAAG TAGATACCGC 

51 GCCTTTGATT TTTTTGCCGC TCTTACCCAA GGCTTCGATG AAAAAGTTGA 

101 TGGTCGAACC GGTACCGATG CCGATGTATT CGTTTTCGGG TACGAATTCG 

151 ACTGCNTTTT CGGCGGCGAT GCGCTTGAGT TCGTCTTGTG TCGTCATATT 

201 TTTGTCCTTT GGGAAACCGT ATCAACAAAC AGCCGCCATC TTAACATTTT 

251 TTNNNACGTC CTGCCCGCCG CGTTCAAATC CTTACCAGCA ATACCGCCGC 

301 CTGCGACTCT ATGCCTTCCA TGCGCCCGAG ATAACCGAGT TTTTCGTTGG 

351 TTTTGCCTTT GANGTTGACG CACGAAATGT CTATGCCCAA ATCGGCGGCG 

401 ATGTTGGCAC GCATTTGCGG AATATGCGGC GCGAGTTTGG GTTTCTGTGC 

451 AATCACGGTC GTATCGACAT TGACCGCCTG CCAACCCTGC GCCTGAACGC 

501 TTTGATACGC CGCACGCAAA AGGACGCGGC TGTCCGCATC TTTGAACTCT 

551 GCGGCGGTGT CGGGGAAATG GCTGCCGATA TCGCCCAAAC CTGCCGCACC 

601 GAGCAGCGCG TCGGTAACGG CGTGCAGCAG CGCATCGGCA TCGGAGTGTC 

651 CGAGCAGCCC TTTTTCAAAT GGGATTTCAA CTCCGCCAAG TATCAGCTTT 

701 CTGCCTTCGG TCAGTTGGTG GACATCGTAG CCCTGTCCGA TACGGATGTT 

751 CGTCATCGTT TGTGTTCCTG A 

This encodes a protein having amino acid sequence [<SEQ ID 796>] (SEQ ID NO: 796): 



1 ISYWASSSLD FLEVDTAPLI FLPLLPKASM KKLMVEPVPM PMYSFSGTNS 
51 TAFSAAMRLS SSCVVIFLSF GKPYQQTAAI LTFFXTSCPP RSNPYQQYRR 
101 LRLYAFHAPE ITEFFVGFAF XVDARNVYAQ IGGDVGTHLR NMRREFGFLC 
151 NHGRIDIDRL PTLRLNALIR RTQKDAAVRI FELCGGVGEM AADIAQTCRT 
201 EQRVGNGVQQ RIGIGVSEQP FFKWDFNSAK YQLSAFGQLV DIVALSDTDV 

251 RHRLCS* 

ORF122a (SEQ ID NO: 796) and ORF122-1 (SEQ ID NO: 794) show 96.9% identity in 256 aa 
overlap: 

10 20 30 40 50 60 

orf 122a .pep 

orf 122-1 



40 

orf 122a .pep 
orf 122-1 



ISYWASSSLDFLEVDTAPLIFLPLLPKASMKKLMVEPVPMPMYSFSGTNSTAFSAANRLS 

llllll I MM M Nil 1 1 1 1 1 1 1 Ml 1 1 1 1 1 1 1 1 hi Ml Mi M II 1 1 1 ! II 

I S YWAS S S PDFLEVDTAPL I FLPLLPKASMKKLMVEPVPMP I YSFSGTNS TAFSAAMRLS 
10 20 30 40 50 60 

70 80 90 100 110 120 

SSCWIFLSFGKPYQQTAA I LTFFXTSCPPRSNPYQQYRRLRLYAFHAPE ITEFFVGFAF 

1 1 i M 1 1 It I 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 Ml II I lllllllllllll IMMIIMM 

SSCWIFLSFGKPYQQTAAILTFFCTSCPPRSNAYQQYRRLRLYAFHPPEIAEFFVGFAF 
70 80 90 100 110 120 



130 140 150 160 170 180 

orf 122a . pep XVDARNVYAQ I GGDVGTHLRNMRREFGFLCNHGRID I DRLPTLRLNAL I RRTQKDAAVRI 

III I MM Mill I MM I M I II 1 1 M 1 1 M M 1 1 1 1 1 1 M 1 1 IMIIIIIII.il 

orf 122-1 DVDARNVYAQ I GGDVGTHLRNVRREFGFLCNHGRID I DRLPTLRLNAL I RRTQKDAAVR I 

130 140 150 160 170 180 

190 200 210 220 230 240 

orf122a.pep FELCGGVGEMAADIAQTCRTEQRVGNGVQQRIGIGVSEQPFFKWDFNSAKYQLSAFGQLV 

IIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIMIIIMIIIIIIIIIIIII 

orf 122 - 1 FELCGGVGEMAAD I AQTCRTEQRVGNGVQQRI G I GVS EQPFFKWDFNS AKYQLS AFGQLV 

190 200 210 220 230 240 

250 

orf 122a .pep DIVALSDTDVRHRLCSX 

Illllllllllllllll 
orfl22-l DIVALSDTDVRHRLCSX 

250 

Homology with a predicted ORF from N. gonorrhoeae 

ORF122 (SEQ ID NO: 792) shows 89.6% identity over a 182 aa overlap with a predicted ORF 
(ORF122ng) (SEQ ID NO: 798) from N. gonorrhoeae: 

orf 122. pep TAFSAALRLS PSXLVI FLS FGKP YQQTAA I 30 

Mlllhlll I :|lllllllllllllll 
1 5 orf 122ng FLPLLPKASMKKLMVEPVPMPMYSFSGTNSTAFSAAMRLSSSCVVI FLS FGKP YQQTAA I 80 

orf 122 . pep LTFFCTSCPPRSNAYQQYRRLRLYAFHPPEIAEFFVGFAFDVDARNVYAQIGGDVGTHLR 90 

III I III Mill II 1 1 III II I II II II II I II II II Ihlllh Ml II 1 1 MM I 

orf 122ng LTFFCTSWPPRSNPYQQYRRLRLYAFHPPE lAEFFVGFAFD IDARNIDTQIGGDVGTHLR 140 

orf 122 .pep NWRECGFLCNHGRID IDRLPTLRLNAL I RRTQKDAAVR I FELCGGVGEMAAD I AQTCRT 150 

20 1 1| | || 1 1 1| || 1 1 1 h II II 1 1 1 II 1 1 1 1 1 1 1 II 1 1 1 1 M 1 1 1 1 h II I h I II 1 1 1 

orf 122ng NVRCE FGFLCNHGR ID I DHLPTLRLNAL I RRTQKDAAVR I FELCGGVGKMAADVAQTCRT 200 

orf 122 .pep EQRVGNGVQQRIGIGVSEQPFFKWDFNSAKYQ 182 

llllllllllhll : I I I I I I ! I I I I I I I 
orf 122ng EQRVGNGVQQRVGIRMPEQPFFKWDFNSAKYQLSAFGQLVDIVALSDTDIRHRLCS 256 

The complete length ORF122ng nucleotide sequence [<SEQ ID 797>] (SEQ ID NO: 797) is: 






1 ATGTCGTACC GGGCAAGCAG TTCGCCGGAT TTTTTGGAGG TTGAAACCGC 

51 GCCTTTGATT TTTTTACCGC TTTTGCCCAA GGCTTCGATG AAGAAATTGa 

101 tgGTCGAACC GgtaCCGATG CCGATGTATT CGTTTTCGGG TACGAATTCG 

151 ACTGCTTTTT CGGCGGCGAT GCGCttgAgt TCgtcttgcg TcgTCATATT 

201 TTTAtccttt gGGAAaccct atcaAcaAAc agccgccatC TTAACATTTT 

251 TTTGCACGtc ctggccgccg cgttcaAATc cgtaccaGca ataccgccgc 

301 ctgcgcctCT AtgcCTTCCA TCCGCCCGAG ATAGCCGAGT TTTTCGTTGG 

351 TTTTGCCTTT GATatTGACG CACGAAATAT CGatacCCAa atcggcgGCG 

401 ATGTTGGCAC GCATTTGCGG AATGTGCGGT GCGAGTTTGG GTTTCTGTGC 

451 AATCACGGTC GTATCGACAT TGACCACCTG CCAACCCTGC GCCTGAACGC 

501 TTTGATACGC CGCACGCAAA AGGACGCGGC TGTCCGCATC TTTGAACTCT 

551 GCGGCGGTGT CGGGAAAATG GCTGCCGATG TCGCCCAAAC CTGCCGCACC 

601 GAGCAGCgcg tcggtaaCGG CGTGCAGCAG cgcgTcgGCA TCCGAATGCC 

651 CGAGCAGCCC TTTTTCAAAT GGGATTTCAA CTCCGCCAAG TATCAGCTTT 

701 CTGCCTTCGG TCAATTGGTG GACATCGTAG CCCTGTCCGA TACGGATATT 

751 CGTCATCGTT TGTGTTCCTG A 

This encodes a protein having amino acid sequence [<SEQ ID 798>] (SEQ ID NO: 798): 



1 MSYRASSSPD FLEVETAPLI FLPLLPKASM KKLMVEPVPM PMYSFSGTNS 
51 TAFSAAMRLS SSCVVIFLSF GKPYQQTAAI LTFFCTSWPP RSNPYQQYRR 

101 LRLYAFHPPE IAEFFVGFAF DIDARNIDTQ IGGDVGTHLR NVRCEFGFLC 

151 NHGRIDIDHL PTLRLNALIR RTQKDAAVRI FELCGGVGKM AADVAQTCRT 

201 EQRVGNGVQQ RVGIRMPEQP FFKWDFNSAK YQLSAFGQLV DIVALSDTDI 

251 RHRLCS* 



ORF122ng (SEQ ID NO: 798) and ORF122-1 (SEQ ID NO: 794) show 92.6% identity in 256 aa 
overlap: 



                      10        20        30        40        50        60
orf122-1.pep  ISYWASSSPDFLEVDTAPLIFLPLLPKASMKKLMVEPVPMPIYSFSGTNSTAFSAAMRLS
              :|| II II I II II hi Mill II MINIMI I MM MINI III Mill II MINI
orf122ng      MSYRASSSPDFLEVETAPLIFLPLLPKASMKKLMVEPVPMPMYSFSGTNSTAFSAAMRLS
                      10        20        30        40        50        60

                      70        80        90       100       110       120
orf122-1.pep  SSCVVIFLSFGKPYQQTAAILTFFCTSCPPRSNAYQQYRRLRLYAFHPPEIAEFFVGFAF
              . 1 1 1 1 1 1 1 1 1 1 1 1 ! 1 1 1 U 1 1 1 1 1 1 1 Mill II I II Ml Mill 1 1 III II I II II I
orf122ng      SSCVVIFLSFGKPYQQTAAILTFFCTSWPPRSNPYQQYRRLRLYAFHPPEIAEFFVGFAF
                      70        80        90       100       110       120

                     130       140       150       160       170       180
orf122-1.pep  DVDARNVYAQIGGDVGTHLRNVRREFGFLCNHGRIDIDRLPTLRLNALIRRTQKDAAVRI
              hlllh ^ II I I i I I I I I I I I I I I I I I I I I I I M I M I I I I I I I I I I I .1 I I I
orf122ng      DIDARNIDTQIGGDVGTHLRNVRCEFGFLCNHGRIDIDHLPTLRLNALIRRTQKDAAVRI
                     130       140       150       160       170       180

                     190       200       210       220       230       240
orf122-1.pep  FELCGGVGEMAADIAQTCRTEQRVGNGVQQRIGIGVSEQPFFKWDFNSAKYQLSAFGQLV
              I | I I I I I h II I h I I I II I I I I I I I I I II h II : I I II I I I I I I I I M I I I I I I I I I
orf122ng      FELCGGVGKMAADVAQTCRTEQRVGNGVQQRVGIRMPEQPFFKWDFNSAKYQLSAFGQLV
                     190       200       210       220       230       240

                     250
orf122-1.pep  DIVALSDTDVRHRLCSX
              MllllllhlllMII
orf122ng      DIVALSDTDIRHRLCSX
                     250

Based on this analysis, it is predicted that the proteins from N. meningitidis and N. gonorrhoeae, and 
their epitopes, could be useful antigens for vaccines or diagnostics, or for raising antibodies. 

Example 95 

The following partial DNA sequence was identified in N. meningitidis [<SEQ ID 799>] (SEQ ID 
NO: 799): 

1 ..GCCGGCGCGA GTGCGAACAA CATTTCCGCG CGTTTTGCGG AAACACCCGT 

51 CGCTGTCAGC GTTACCCTGA TCGGCACGGT ACTTGCCGTC ATGCTGCCCG 

101 TTACCGAATA TGAAAACTTC CTGCTGCTTA TCGGCTCGGT ATTTGCGCCG 

151 ATGGGGCGGA TTTTGATTGC CGACTTTTTC GTCTTGAAAC GGCGTGA 

This corresponds to the amino acid sequence [<SEQ ID 800; ORF125>] (SEQ ID NO: 800; 
ORF125): 

1 ..AGASANNISA RFAETPVAVS VTLIGTVLAV MLPVTEYENF LLLIGSVFAP 

51 MGGFDCRLFR LETA* 

Further work revealed the complete nucleotide sequence [<SEQ ID 801>] (SEQ ID NO: 801): 

1 ATGTCGGGCA ATGCCTCCTC TCCTTCATCT TCCTCCGCCA TCGGGCTGAT 

51 TTGGTTCGGC GCGGCGGTAT CGATTGCCGA AATCAGCACG GGTACGCTGC 

101 TTGCGCCTTT GGGCTGGCAG CGCGGTCTGG CGGCTCTACT TTTGGGTCAT 

151 GCCGTCGGCG GCGCGCTGTT TTTTGCGGCG GCGTATATCG GCGCACTGAC 

201 CGGACGCAGC TCGATGGAAA GCGTGCGCCT GTCGTTCGGC AAACGCGGTT 
251 CAGTGCTGTT TTCCGTGGCG AATATGCTGC AACTGGCCGG CTGGACGGCG 
301 GTGATGATTT ACGCCGGCGC AACGGTCAGC TCCGCTTTGG GCAAAGTGTT 

351 GTGGGACGGC GAATCTTTTG TCTGGTGGGC ATTGGCAAAC GGCGCGCTGA 
401 TTGTGCTGTG GCTGGTTTTC GGCGCACGCA AAACAGGCGG GCTGAAAACC 

451 GTTTCGATGC TGCTGATGCT GTTGGCGGTT CTGTGGCTGA GTGCCGAAGT 
501 CTTTTCCACG GCAGGCAGCA CCGCCGCACA GGTTTCAGAC GGCATGAGTT 
551 TCGGAACGGC AGTCGAGCTG TCCGCCGTGA TGCCGCTTTC CTGGCTGCCG 
601 CTTGCCGCCG ACTACACGCG CCACGCGCGC CGCCCGTTTG CGGCAACCCT 

651 GACGGCAACG CTCGCCTACA CGCTGACCGG CTGCTGGATG TATGCCTTGG 

701 GTTTGGCAGC GGCGTTGTTC ACCGGAGAAA CCGACGTGGC AAAAATCCTG 

751 CTGGGCGCAG GTTTGGGTGC GGCAGGCATT TTGGCGGTCG TCCTCTCCAC 

801 CGTTACCACA ACGTTTCTCG ATGCCTATTC CGCCGGCGCG AGTGCGAACA 

851 ACATTTCCGC GCGTTTTGCG GAAACACCCG TCGCTGTCGG CGTTACCCTG 

901 ATCGGCACGG TACTTGCCGT CATGCTGCCC GTTACCGAAT ATGAAAACTT 

951 CCTGCTGCTT ATCGGCTCGG TATTTGCGCC GATGGCGGCG GTTTTGATTG 

1001 CCGACTTTTT CGTCTTGAAA CGGCGTGAGG AGATTGAAGG CTTTGACTTT 

1051 GCCGGACTGG TTCTGTGGCT TGCGGGCTTC ATCCTCTACC GCTTCCTGCT 

1101 CTCGTCCGGC TGGGAAAGCA GCATCGGTCT GACCGCCCCC GTAATGTCTG 

1151 CCGTTGCCAT TGCCACCGTA TCGGTACGCC TTTTCTTTAA AAAAACCCAA 

1201 TCTTTACAAA GGAACCCGTC ATGA 

This corresponds to the amino acid sequence [<SEQ ID 802; ORF125-1>] (SEQ ID NO: 802; 
ORF125-1): 

1 MSGNASSPSS SSAIGLIWFG AAVSIAEIST GTLLAPLGWQ RGLAALLLGH 

51 AVGGALFFAA AYIGALTGRS SMESVRLSFG KRGSVLFSVA NMLQLAGWTA 

101 VMIYAGATVS SALGKVLWDG ESFVWWALAN GALIVLWLVF GARKTGGLKT 

151 VSMLLMLLAV LWLSAEVFST AGSTAAQVSD GMSFGTAVEL SAVMPLSWLP 

201 LAADYTRHAR RPFAATLTAT LAYTLTGCWM YALGLAAALF TGETDVAKIL 
251 LGAGLGAAGI LAVVLSTVTT TFLDAYSAGA SANNISARFA ETPVAVGVTL 

301 IGTVLAVMLP VTEYENFLLL IGSVFAPMAA VLIADFFVLK RREEIEGFDF 
351 AGLVLWLAGF ILYRFLLSSG WESSIGLTAP VMSAVAIATV SVRLFFKKTQ 

401 SLQRNPS* 
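
The sequence listings in this specification are laid out as blocks of ten residues, five blocks per row, with the position of the first residue at the start of each row. A minimal sketch of that layout is given below; the helper name `format_blocks` is hypothetical and is not taken from the specification.

```python
def format_blocks(seq, block=10, per_line=5):
    """Lay out a sequence as rows of `per_line` blocks of `block` residues.

    Each row starts with the 1-based position of its first residue,
    mirroring the listing style used in this document (hypothetical helper).
    """
    seq = "".join(seq.split())
    width = block * per_line
    rows = []
    for start in range(0, len(seq), width):
        row = seq[start:start + width]
        blocks = [row[i:i + block] for i in range(0, len(row), block)]
        rows.append(f"{start + 1} " + " ".join(blocks))
    return "\n".join(rows)


# First 55 residues of the amino acid sequence listed above.
print(format_blocks("MSGNASSPSSSSAIGLIWFGAAVSIAEISTGTLLAPLGWQRGLAALLLGHAVGGA"))
```

Re-running a listing through such a formatter is one way to check that every block in a row really contains ten residues.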

Computer analysis of this amino acid sequence gave the following results: 



Homology with a predicted ORF from N. meningitidis (strain A) 

ORF125 (SEQ ID NO: 800) shows 76.5% identity over a 51 aa overlap with an ORF (ORF125a) 
(SEQ ID NO: 804) from strain A of N. meningitidis: 



                                                  10        20        30
orf125.pep                                AGASANNISARFAETPVAVSVTLIGTVLAV
                                          I = I I = I = = = I I = I I I
orf125a     KILLGAGLGAAGILAVVLSTVTTTFLDAYSAGVSANNISAKLSEIPIAVAVAVVGTLLAV
            250       260       270       280       290       300

                    40        50        60
orf125.pep  MLPVTEYENFLLLIGSVFAPMGGFDCRLFRLETAX
            : I I I I I M I I I I I I I I I I I M : '
orf125a     LLPVTEYENFLLLIGSVFAPMAAVLIADFFVLKRREEIEG
            310       320       330       340



The ORF125a partial nucleotide sequence [<SEQ ID 803>] (SEQ ID NO: 803) is: 



1 ATGTCGGGCA ATGCCTCCTC TCNTTCATCT TCCGCCGCCA TCGGGCTGAT 

51 TTGGTTCGGC GCGGCGGTAT CGATTGCCGA AATCAGCACG GGTACACTGC 

101 TTGCGCCTTT GGGCTGGCAG CGCGGTCTGG CNGCTCTGCT TTTGGGTCAT 

151 GCCGTCGGCG GCGCGCTGTT TTTTGCGGCG GCGTATATCG GCGCACTGAC 

201 CGGACNCANC TCGATGGAAA GCGTGCGCCT GTCGTTCGGC AAACGCGGTT 

251 CAGTGCTGTT TTCCGTGGCG AATATGCTGC AACTGGCCGG CTGGACGGCG 

301 GTGATGATTT ACGCCGGCGC AACGGTCAGC TCCGCTTTGG GCAAAGTGTT 

351 GTGGGACGGC GAATCTTTTG TCTGGTGGGC ATTGGCAAAC GGCGCGCTGA 

401 TTGTGCTGTG GCTGGTTTTC GGCGCACGCA AAACAGGCGG GCTGAAAACC 
451 GTTTCGATGC TGCTGATGCT GTTGGCGGTT CTGTGGCTGA GTGCCGAANT 
501 NTTTTCCACG GCAGGCAGCA CCGCCGCANN GGTNNCAGAC GGCATGAGTT 
551 TCGGAACGGC AGTCGAGCTG TCCGCCGTNA TGCCGCTTTC TTGGCTGCCG 
601 CTGGCCGCCG ACTACACGCG CCACGCGCGC CGCCCGTTTG CGGCAACCCT 
651 GACGGCAACG CTCGCCTACA CGCTGACCGG CTGCTGGATG TATGCCTTGG 
701 GTTTGGCAGC GGCGTTGTTC ACCGGAGAAA CCGACGTGGC AAAAATCCTG 
751 CTGGGCGCAG GTTTGGGTGC GGCAGGCATT TTGGCGGTCG TCCTGTCGAC 
801 CGTTACCACC ACTTTTCTCG ATGCNTACTC CGCCGGCGTA AGTGCCAACA 
851 ATATTTCCGC CAAACTTTCG GAAATACCNA TCGCCGTTGC CGTCGCCGTT 
901 GTCGGCACAC TGCTTGCCGT CCTCCTGCCC GTTACCGAAT ATGAAAACTT 
951 CCTGCTGCTT ATCGGCTCGG TATTTGCGCC GATGGCGGCG GTTTTGATTG 

1001 CCGACTTTTT CGTCTTGAAA CGGCGTGAGG AGATTGAAGG C.. 



This encodes a protein having the partial amino acid sequence [<SEQ ID 804>] (SEQ ID NO: 
804): 

1 MSGNASSXSS SAAIGLIWFG AAVSIAEIST GTLLAPLGWQ RGLAALLLGH 

51 AVGGALFFAA AYIGALTGXX SMESVRLSFG KRGSVLFSVA NMLQLAGWTA 

101 VMIYAGATVS SALGKVLWDG ESFVWWALAN GALIVLWLVF GARKTGGLKT 

151 VSMLLMLLAV LWLSAEXFST AGSTAAXVXD GMSFGTAVEL SAVMPLSWLP 

201 LAADYTRHAR RPFAATLTAT LAYTLTGCWM YALGLAAALF TGETDVAKIL 

251 LGAGLGAAGI LAVVLSTVTT TFLDAYSAGV SANNISAKLS EIPIAVAVAV 

301 VGTLLAVLLP VTEYENFLLL IGSVFAPMAA VLIADFFVLK RREEIEG.. 



ORF125a (SEQ ID NO: 804) and ORF125-1 (SEQ ID NO: 802) show 94.5% identity in 347 aa 
overlap: 


10 20 30 40 50 60 

orf 12 5a. pep MSGNASSXSSSAAIGLIWFGAAVSIAEISTGTLLAPLGWQRGLAALLLGHAVGGALFFAA 
Illllll I I I M I I I I M M I I I I I I I I I II I I I II I I I I I I I M I I M II I I I I I I 
orf 12 5-1 MSGNASSPSSSSAIGLIWFGAAVSIAEISTGTLLAPLGWQRGLAALLLGHAVGGALFFAA 

10 20 30 40 50 60 

70 80 90 100 110 120 

orf 12 5a. pep AYIGALTGXXSMESVRLSFGKRGSVLFSVANMLQLAGWTAVMIYAGATVSSALGKVLWDG 

Illlllll 1 1 M 1 1 1 1 1 1 1 : 1 1 1 1 1 1 1 1 1 1 Ml 1 1 M 1 1 1 1 1 1 1 1 II 1 1 1 1 M I , 

orf 12 5 - 1 AYIGALTGRSSMESVRLSFGKRGSVLFSVANMLQLAGWTAVMI YAGATVSSALGKVLWDG 

70 80 90 100 110 120 

130 140 150 160 170 180 

or f 12 5a . pep ESFVWWALANGALIVLWLVFGARKTGGLKTVSMLLMLLAVLWLSAEXFSTAGSTAAXVXD 

I M 1 1 1 1 1 II 1 1 1 1 1 1 1 1 I M M 1 1 1 1 M I ! 1 1 1 1 1 ! 1 1 : 1 1 I lllllllll I I 

orf 12 5-1 ESFVWWALANGALIVLWLVFGARKTGGLKTVSMLLMLLAVLWLSAEVFSTAGSTAAQVSD 

130 140 150 160 170 180 

190 200 210 220 230 240 

orf 12 5a . pep GMSFGTAVELSAVMPLSWLPLAADYTRHARRPFAATLTATLAYTLTGCWMYALGLAAALF 

M 1 1 1 1 M 1 1 1 1 1 1 M 1 1 1 1 1 1 1 1 1 II 1 1 1 1 1 1 1 1 1 M M I II 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 

or f 12 5 - 1 GMSFGTAVELSAVMPLSWLPLAADYTRHARRPFAATLTATLAYTLTGCWMYALGLAAALF 

190 200 210 220 230 240 

250 260 270 280 290 300 

orf 12 5a . pep TGETDVAKI LLGAGLGAAGI LAWLSTVTTTFLD AYS AGVSANNISAKLSE I PI AVAVAV 
II I M I I I I I M I II I I I I I I I I I I I I I M 1 I I M I M I I I I I MM MM- 
orf 12 5 - 1 TGETDVAKI LLGAGLGAAG I LAWLSTVTTTFLDAYS AGAS ANN I SARFAETPVAVGVTL 

250 260 270 280 290 300 

310 320 330 340 

orf 12 5a . pep VGTLLAVLLP VTEYENFLLL IGS VFAPMAAVL I ADFFVLKRREE I EG 

M Ml MM III Illllll MM lllllllllll II I Illllll 

orf 12 5 - 1 IGTVLAVMLPVTEYENFLLLIGSVFAPMAAVLIADFFVLKRREEIEGFDFAGLVLWLAGF 

310 320 330 340 350 360 



Homology with a predicted ORF from N. gonorrhoeae 

ORF125 (SEQ ID NO: 800) shows 86.2% identity over a 65 aa overlap with a predicted ORF 
(ORF125ng) (SEQ ID NO: 806) from N. gonorrhoeae: 



orf125.pep                                AGASANNISARFAETPVAVSVTLIGTVLAV   30
                                          MINIMI MM IIIMM Mill
orf125ng    KILLGAGLGITGILAVVLSTVTTTFLDTYSAGASANNISARFAEIPVAVGVTLIRTVLAV  308

orf125.pep  MLPVTEYENFLLLIGSVFAPM-GGFDCRLFRLETA   64
            MMIIhlMIII MM Illlllll Ml
orf125ng    MLPVTEYKNFLLLIRSVFGPMAGGFDCRLFCLKTA  343

An ORF125ng nucleotide sequence [<SEQ ID 805>] (SEQ ID NO: 805) was predicted to encode a 
protein having amino acid sequence [<SEQ ID 806>] (SEQ ID NO: 806): 

1 MSGNASSPSS SAAIGLVWFG AAVSIAEIST GTLLAPLGWQ RGLAALLLGH 
51 AVGGALFFAA AYIGALTGRS SMESVRLSFG KCGSVLFSVA NMLQLAGWTA 
101 VMIYVGATVS SALGKVLWDG ESFVWWALAN GALIVLWLVF GARRTGGLKT 

151 VSMLLMLLAV LWLSVEVFAS SGTNAAPAVS DGMTFGTAVE LSAVMPLSWL 

201 PLAADYTRQA RRPFAATLTA TLAYTLTGCW MYALGLAAAL FTGETDVAKI 

251 LLGAGLGITG ILAVVLSTVT TTFLDTYSAG ASANNISARF AEIPVAVGVT 

301 LIRTVLAVML PVTEYKNFLL LIRSVFGPMA GGFDCRLFCL KTA* 

Further work revealed the following gonococcal DNA sequence [<SEQ ID 807>] (SEQ ID NO: 
807): 



1 ATGTCGGGCA ATGCCTCCTC TCCTTCATCT TCCGCCGCCA TCGGGCTGGT 

51 TTGGTTCGGC GCGGCGGTAT CGATTGCCGA AATCAGCACG GGTACGCTGC 

101 TCGCCCCCTT GGGCTGGCAG CGCGGTCTGG CGGCCCTGCT TTTGGGTCAT 

151 GCCGTCGGCG GCGCGCTGTT TTTTGCGGCG GCGTATATCG GCGCACTGAC 

201 CGGACGCAGC TCGATGGAAA GTGTGCGCCT GTCGTTCGGC AAATGCGGTT 

251 CAGTGCTGTT TTCCGTGGCG AATATGCTGC AACTGGCCGG CTGGACGGCG 

301 GTGATGATTT ACGTCGGCGC AACGGTCAGC TCCGCTTTGG GCAAAGTGTT 

351 GTGGGACGGC GAATCCTTTG TCTGGTGGGC ATTGGCAAAC GGCGCACTGA 

401 TCGTGCTGTG GCTGGTTTTC GGCGCACGCA GAACGGGCGG GCTGAAAACC 

451 GTTTCGATGC TGCTGATGCT GCTTGCCGTG TTGTGGTTGA GCGTCGAAGT 

501 GTTCGCTTCG TCCGGCACAA ACGCCGCGCC CGCCGTTTCA GACGGCATGA 

551 CCTTCGGAAC GGCAGTCGAA CTGTCCGCCG TCATGCCGCT TTCCTGGCTG 

601 CCGCTGGCCG CCGACTACAC GCGCCAAGCA CGCCGCCCGT TTGCGGCAAC 

651 CCTGACGGCA ACGCTCGCCT ATACGCTGAC GGGCTGCTGG ATGTATGCCT 

701 TGGGTTTGGC GGCGGCTCTG TTTACCGGAG AAACCGACGT GGCGAAAATC 

751 CTGTTGGGCG CGGGCTTGGG CATAACGGGC ATTCTGGCAG TCGTCCTCTC 

801 CACCGTTACC ACAACGTTTC TCGATACCTA TTCCGCCGGC GCGAGTGCGA 

851 ACAACATTTC CGCGCGTTTT GCGGAAATAC * CCGTCGCTGT CGGCGTTACC 

901 CTGATCGGCA CGGTGCTTGC CGTCATGCTG CCCGTTACCG AATATAAAAA 

951 CTTCCTGCTG CTTATCGGCT CGGTATTTGC GCCGATGGCG GCGGTTTTGA 

1001 TTGCCGACTT TTTCGTCTTA AAACGGCGTG AGGAGATTGA AGGCTTTGAC 

1051 TTTGCCGGAC TGGTTCTGTG GCTGGCAGGC TTCATCCTCT ACCGCTTCCT 

1101 GCTCTCGTCC GGTTGGGAAA GCAGCATCGG TCTGACCGCC CCCGTAATGT 

1151 CTGCCGTTGC CATTGCCACC GTATCGGTAC GCCTTTTCTT TAAAAAAACC 

1201 CAATCTTTAC AAAGGAACCC GTCATGA 

This corresponds to the amino acid sequence [<SEQ ID 808; ORF125ng-1>] (SEQ ID NO: 808; 
ORF125ng-1): 

1 MSGNASSPSS SAAIGLVWFG AAVSIAEIST GTLLAPLGWQ RGLAALLLGH 
51 AVGGALFFAA AYIGALTGRS SMESVRLSFG KCGSVLFSVA NMLQLAGWTA 
101 VMIYVGATVS SALGKVLWDG ESFVWWALAN GALIVLWLVF GARRTGGLKT 
151 VSMLLMLLAV LWLSVEVFAS SGTNAAPAVS DGMTFGTAVE LSAVMPLSWL 
201 PLAADYTRQA RRPFAATLTA TLAYTLTGCW MYALGLAAAL FTGETDVAKI 
251 LLGAGLGITG ILAVVLSTVT TTFLDTYSAG ASANNISARF AEIPVAVGVT 

301 LIGTVLAVML PVTEYKNFLL LIGSVFAPMA AVLIADFFVL KRREEIEGFD 
351 FAGLVLWLAG FILYRFLLSS GWESSIGLTA PVMSAVAIAT VSVRLFFKKT 

401 QSLQRNPS* 

ORF125ng-1 (SEQ ID NO: 808) and ORF125-1 (SEQ ID NO: 802) show 95.1% identity in 408 aa 
overlap: 

10 20 30 40 50 60 

orf125-1.pep MSGNASSPSSSSAIGLIWFGAAVSIAEISTGTLLAPLGWQRGLAALLLGHAVGGALFFAA 

1 1 1 i 1 1 1 1 1 1 1 : 1 M h M M 1 1 1 1 1 II 1 1 1 1 , 1 M I ' I i I M I M I 1 1 1 i ; M 1 1 M 
orf125ng-1 MSGNASSPSSSAAIGLVWFGAAVSIAEISTGTLLAPLGWQRGLAALLLGHAVGGALFFAA 

10 20 30 40 50 60 

70 80 90 100 110 120 

or f 12 5 - 1 . pep AYIGALTGRSSMESTOLSFGKRGSVLFSVANMLQLAGWTAVMIYAGATVSSALGKVLWDG 

5 1 1 1 1 1 1 1 1 I M 1 1 M 1 1 1 1 1 1 1 1 ! 1 1 1 1 1 1 1 1 1 1 II 1 1 1 1 hi 1 1 II II 1 1 1 1 1 1 1 

orf 125ng- 1 AYIGALTGRSSMESTOLSFGKCGSVLFSVANMLQIAGWTAVMIYVGATVSSALGKVLWDG 

70 80 90 100 110 120 

130 140 150 160 170 179 

orf 125- 1 . pep ESFVWWALANGALIVLWLVFGARKTGGLKTVSMLLMLLAVLWLSAEVFSTAGSTAAQ- VS 

10 1 1 1 M 1 1 1 1 1 1 1 1 M 1 1 1 1 1 1 M I U 1 1 1 1 II 1 1 II 1 1 1 1 1 M I h-l- 1 1 II 

orf 12 5ng-l ESFVWWALANGAL I VLWLVFGARRTGGLKTVSMLLMLIAVLWLS VEVFAS SGTNAAPAVS 

130 140 150 160 170 180 

180 190 200 210 220 230 239 

orf 125-1. pep DGMSFGTAVELSAVMPLSWLPLAADYTRHARRPFAATLTATLAYTLTGCWMYALGLAAAL 

15 1 1 1 : 1 1 1 1 ; I M Ml 1 : 1 M Mill 11 1 1 1 1 1 1 , M M I 

or f 1 2 5ng - 1 DGMTFGTAVELSAVMPLSWLPLAADYTRQARRPFAATLTATLAYTLTGCWMYALGLAAAL 

190 200 210 220 230 240 

240 250 260 270 280 290 299 

orf 125-1. pep FTGETDVAKILLGAGLGAAGILAWLSTVTTTFLDAYSAGASANNISARFAETPVAVGVT 

20 | | | | | | | | | | | | | | | | | : | M || | M M | | | | | | : I I I I I I I I I M I I I M I I I I I I I 

orf 125ng-l FTGETDVAKILLGAGLGITGILAWLSTVTTTFLDTYSAGASANNISARFAEIPVAVGVT 

250 260 270 280 290 300 

300 310 320 330 340 350 359 

orf 12 5 - 1 . pep L I GTVLAVMLP VTE YENFLLL I GS VFAPMAAVL I ADFFVLKRREE I EGFDFAGLVLWLAG 

25 || | | | | | | | | | | | | | : | | | | | | | | | | | | | | | | | | || I I I I I I I I I I I I I I I I I I I I I I I I 

orf 125ng-l LI GTVLAVMLP VTE YKNFLLL I GS VFAPMAAVL I ADFFVLKRREE I EGFDFAGLVLWLAG 

310 320 330 340 350 360 

360 370 380 390 400 

orf 125-1 .pep FILYRFLLSSGWESSIGLTAPVMSAVAIATVSVRLFFKKTQSLQRNPSX 

30 | | | | | | | | | | | | | | | | || | | | | | | | | | | | | | | | | | | | I II II I I M I I I 

orf 12 5ng-l FILYRFLLSSGWESSIGLTAPVMSAVAIATVSVRLFFKKTQSLQRNPSX 

370 380 390 400 

Based on this analysis, including the presence of a putative leader sequence and transmembrane 
domains in the gonococcal protein, it is predicted that the proteins from N. meningitidis and 
N. gonorrhoeae, and their epitopes, could be useful antigens for vaccines or diagnostics, or for 
raising antibodies. 



Example 96 



The following partial DNA sequence was identified in N. meningitidis [<SEQ ID 809>] (SEQ ID 
NO: 809): 



1 ATGACCCGTA TCGCCATCCT CGGCGGCGGC CTCTCGGGAA GGCTGACCGC 
51 GTTGCAGCTT GCAGAACAAG GTTATCAGAT TGCACTTTTC GATAAAAGCT 
101 GCCGCCGGGG CGAACACGCC GCCGCCTATG TAGCCGCCGC CATGCTCGCG 
151 CCTGCAGCGG A.ACGGTCGA AGCCACGCCC GAAGTGGTCA GGCTGGGCAG 

201 GCAGAGCATC CCGCTTTGGC GCGGCATCCG ATGCCGTCTG AACACGCACA 

251 CGATGATGCA GGAAAACGGC AGCCTGATTG TATGGCACGG GCAGGACAAG 

301 CCATTATCCA GCGAGTTCGT CCGCCATCTC AAACGCGGCG GCGT.ACGGA 

351 TGACGAAATC GTCCGTTGGC GCGCCGACGA CATCGCCGAA CGCGAACCGC 

401 AACTCGGCGG ACGTTTTTAA GACGGCATCT ACCTGCCGAC CGAAGC.CAG 

451 CTCGACGGGC GGCAATTATA GTCTGCACTT GCCGACGCTT TGGACGAACT 

501 GAACGTCCCC TGCCATTGGG AACACGAATG CGTCCCCGAA GCCTGCAAG.. 

This corresponds to the amino acid sequence [<SEQ ID 810; ORF126>] (SEQ ID NO: 810; 
ORF126): 

1 MTRIAILGGG LSGRLTALQL AEQGYQIALF DKSCRRGEHA AAYVAAAMLA 

51 PAAXTVEATP EVVRLGRQSI PLWRGIRCRL NTHTMMQENG SLIVWHGQDK 

101 PLSSEFVRHL KRGGXTDDEI VRWRADDIAE REPQLGGRFX DGIYLPTEXQ 

151 LDGRQLXSAL ADALDELNVP CHWEHECVPE ACK... 

Further work revealed the complete nucleotide sequence [<SEQ ID 811>] (SEQ ID NO: 811): 



1 ATGACCCGTA TCGCCATCCT CGGCGGCGGC CTCTCGGGAA GGCTGACCGC 

51 GTTGCAGCTT GCAGAACAAG GTTATCAGAT TGCACTTTTC GATAAAGGCT 

101 GCCGCCGGGG CGAACACGCC GCCGCCTATG TTGCCGCCGC CATGCTCGCG 

151 CCTGCGGCGG AAGCGGTCGA AGCCACGCCC GAAGTGGTCA GGCTGGGCAG 

201 GCAGAGCATC CCGCTTTGGC GCGGCATCCG ATGCCGTCTG AACACGCACA 

251 CGATGATGCA GGAAAACGGC AGCCTGATTG TGTGGCACGG GCAGGACAAG 

301 CCATTATCCA GCGAGTTCGT CCGCCATCTC AAACGCGGCG GCGTAGCGGA 

351 TGACGAAATC GTCCGTTGGC GCGCCGACGA CATCGCCGAA CGCGAACCGC 

401 AACTCGGCGG ACGTTTTTCA GACGGCATCT ACCTGCCGAC CGAAGGCCAG 

451 CTCGACGGGC GGCAAATATT GTCTGCACTT GCCGACGCTT TGGACGAACT 

501 GAACGTCCCC TGCCATTGGG AACACGAATG CGTCCCCGAA GGCCTGCAAG 

551 CCCAATACGA CTGGCTGATC GACTGCCGCG GCTACGGCGC AAAAACCGCG 

601 TGGAACCAAT CCCCCGAGCA CACCAGCACC CTGCGCGGCA TACGCGGCGA 

651 AGTGGCGCGG GTTTACACAC CCGAAATCAC GCTCAACCGC CCCGTGCGTC 

701 TGCTCCATCC GCGTTATCCG CTCTACATCG CCCCGAAAGA AAACCACGTC 

751 TTCGTCATCG GCGCGACCCA AATCGAAAGC GAAAGCCAAG CCCCCGCCAG 

801 CGTGCGTTCA GGGTTGGAAC TCTTGTCCGC ACTCTATGCC ATCCACCCCG 

851 CCTTCGGCGA AGCCGACATC CTCGAAATCG CCACCGGCCT GCGCCCCACG 

901 CTCAACCACC ACAACCCCGA AATCCGTTAC AACCGCGCCC GACGCCTGAT 

951 TGAAATCAAC GGCCTTTTCC GCCACGGTTT CATGATCTCC CCCGCCGTAA 

1001 CCGCCGCCGC CGCCAGATTG GCAGTGGCAC TGTTTGACGG AAAAGACGCG 

1051 CCCGAACGCG ATAAAGAAAG CGGTTTGGCG TATATCCGAA GACAAGATTA 

1101 A 

This corresponds to the amino acid sequence [<SEQ ID 812; ORF126-1>] (SEQ ID NO: 812; 
ORF126-1): 

1 MTRIAILGGG LSGRLTALQL AEQGYQIALF DKGCRRGEHA AAYVAAAMLA 

51 PAAEAVEATP EVVRLGRQSI PLWRGIRCRL NTHTMMQENG SLIVWHGQDK 

101 PLSSEFVRHL KRGGVADDEI VRWRADDIAE REPQLGGRFS DGIYLPTEGQ 

151 LDGRQILSAL ADALDELNVP CHWEHECVPE GLQAQYDWLI DCRGYGAKTA 

201 WNQSPEHTST LRGIRGEVAR VYTPEITLNR PVRLLHPRYP LYIAPKENHV 

251 FVIGATQIES ESQAPASVRS GLELLSALYA IHPAFGEADI LEIATGLRPT 

301 LNHHNPEIRY NRARRLIEIN GLFRHGFMIS PAVTAAAARL AVALFDGKDA 

351 PERDKESGLA YIRRQD* 



Computer analysis of this amino acid sequence gave the following results: 

Homology with a predicted ORF from N. meningitidis (strain A) 

ORF126 (SEQ ID NO: 810) shows 90.0% identity over a 180 aa overlap with an ORF (ORF126a) 
(SEQ ID NO: 814) from strain A of N. meningitidis: 



                    10        20        30        40        50        60
orf126.pep  MTRIAILGGGLSGRLTALQLAEQGYQIALFDKSCRRGEHAAAYVAAAMLAPAAXTVEATP
            1 1 : 1 I I I I I I I I I M I ! I I I I I ! I I I I I I II HI I I I I I I I I I I I M I I I ! I HUM
orf126a     MTRIAILGGGLSGRLTALQLAEQGYQIALFDKGCRRGEHAAAYVAAAMLAPAAEAVEATP
                    10        20        30        40        50        60

                    70        80        90       100       110       120
orf126.pep  EVVRLGRQSIPLWRGIRCRLNTHTMMQENGSLIVWHGQDKPLSSEFVRHLKRGGXTDDEI
            MINIM I I I I I I I I I : I = I Ml M I I I I I I I I I I I M M I I I I I I I Ml I
orf126a     EVVRLGRQXIPLWRGIRCHLKTPAMMXENGSLIVWHGQDKPLSNEFVRHLKRGGVADDXI
                    70        80        90       100       110       120

                   130       140       150       160       170       180
orf126.pep  VRWRADDIAEREPQLGGRFXDGIYLPTEXQLDGRQLXSALADALDELNVPCHWEHECVPE
            I Ml M MM MINI I II MM II II 111111= I II 1 1 1 1 1 1 1 II 1 1 II 1 1 1 1 M I
orf126a     VRWRADDIAEREPQLGGRFSDGIYLPTEGQLDGRQILSALADALDELNVPCHWEHECAPE
                   130       140       150       160       170       180



The complete length ORF126a nucleotide sequence [<SEQ ID 813>] (SEQ ID NO: 813) is: 

1 ATGACCCGTA TCGCCATCCT CGGCGGCGGC CTCTCNGGAA GGCTGACCGC 

51 ACTGCAGCTT GCAGAACAAG GTTATCAGAT TGCACTTTTC GATAAAGGCT 

101 GCCGCCGGGG CGAACACGCC GCCGCCTATG TTGCCGCCGC CATGCTCGCG 

151 CCTGCGGCGG AAGCGGTCGA AGCCACGCCT GAAGTGGTCA GGCTGGGCAG 

201 GCAGANCATC CCGCTTTGGC GCGGCATCCG ATGCCATCTG AAAACGCCTG 

251 CCATGATGCA NGAAAACGGC AGCCTGATTG TGTGGCACGG GCAGGACAAA 

301 CCTTTATCCA ACGAGTTCGT CCGCCATCTC AAACGCGGCG GCGTAGCGGA 

351 TGACNAAATC GTCCGTTGGC GCGCCGACGA CATCGCCGAA CGCGAACCGC 

401 AACTCGGCGG ACGTTTTTCA GACGGCATCT ACCTGCCGAC CGAAGGCCAG 

451 CTCGACGGGC GGCAAATATT GTCTGCACTT GCCGACGCTT TGGACGAACT 

501 GAACGTCCCC TGCCATTGGG AACACGAATG TGCCCCCGAA GACTTGCAAG 

551 CCCAATACGA CTGGCTGATC GACTGCCGCG GCTACGGCGC AAAAACCGCG 

601 TGGAACCAAT CCCCCGANNA NACCAGCACC CTGCGCGGCA TACGCGGCGA 

651 AGTGGCGCGG GTTTACACAC CCGAAATCAC GCTCAACCGC CCCGTGCGCC 

701 TGCTACACCC GCGCTATCCG CTNTACATCG CCCCGAAAGA AAACCNCGTC 

751 TTCGTCATCG GCGCGACCCA AATCGAAAGC GAAAGCCAAG CACCTGCCAG 

801 CGTGCGTTCC GGGCTGGAAC TCTTATCCGC ACTCTATGCC GTCCACCCCG 

851 CCTTCGGCGA AGCCGACATC CTCGAAATCG CCACCGGCCT GCGCCCCACG 

901 CTCAATCACC ACAACCCCGA AATCCGTTAC AACCGCGCCC GACGCCTGAT 

951 TGAAATCAAC GGCCTTTTCC GCCACGGTTT CATGATCTCC CCCGCCGTAA 

1001 CCGCCGCCGC CGTCAGATTG GCAGTGGCAC TGTTTGACGG AAAAGANGCG 

1051 CCCGAACGCG ATGAAGAAAG CGGTTTGGCG TATATCCGAA GACAAGATTA 

1101 A 



This encodes a protein having amino acid sequence [<SEQ ID 814>] (SEQ ID NO: 814): 

1 MTRIAILGGG LSGRLTALQL AEQGYQIALF DKGCRRGEHA AAYVAAAMLA 

51 PAAEAVEATP EVVRLGRQXI PLWRGIRCHL KTPAMMXENG SLIVWHGQDK 

101 PLSNEFVRHL KRGGVADDXI VRWRADDIAE REPQLGGRFS DGIYLPTEGQ 

151 LDGRQILSAL ADALDELNVP CHWEHECAPE DLQAQYDWLI DCRGYGAKTA 

201 WNQSPXXTST LRGIRGEVAR VYTPEITLNR PVRLLHPRYP LYIAPKENXV 

251 FVIGATQIES ESQAPASVRS GLELLSALYA VHPAFGEADI LEIATGLRPT 

301 LNHHNPEIRY NRARRLIEIN GLFRHGFMIS PAVTAAAVRL AVALFDGKXA 

351 PERDEESGLA YIRRQD* 

ORF126a (SEQ ID NO: 814) and ORF126-1 (SEQ ID NO: 812) show 95.4% identity in 366 aa 
overlap: 






10 20 30 40 50 60 

orf 126a . pep MTRIAILGGGLSGRLTALQLAEQGYQIALFDKGCRRGEHAAAYVAAAMLAPAAEAVEATP 

1 1 II 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 ! 1 1 II 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 

orf 126-1 MTRIAILGGGLSGRLTALQIiAEQGYQIALFDKGCRRGEHAAAYVAAAMLAPAAEAVEATP 

10 20 30 40 50 60 






70 80 90 100 110 120 

orf 126a . pep EWRLGRQXIPLWRGIRCHLKTPAMMXENGSLIVWHGQDKPLSNEFVRHLKRGGVADDXI 

MINIM MMMMhhl Ml 1 1 1 II I M 1 1 1 ! 1 1 M 1 1 1 1 1 1 1 1 1 1 1 I 

orf 126 - 1 EWRLGRQSIPLWRGIRCRLNTHTMMQENGSLIVWHGQDKPLSSEFVRHLKRGGVADDEI 

70 80 90 100 110 120 






130 140 150 160 170 180 

orf 126a . pep VRWRADDIAEREPQLGGRFSDGIYLPTEGQLDGRQILSALADALDELNVPCHWEHECAPE 

1 1 1 1 1 1 1 1 II 1 1 1 1 1 1 1 1 1 ih M 1 1 M 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 M 1 1 i 1 1 1 1 M I 

orfl26-l VRWRADD I AEREPQLGGRFSDGI YLPTEGQLDGRQI LS ALADALDELNVPCHWEHECVPE 

130 140 150 160 170 180 



190 200 210 220 230 240 

orf 126a . pep DLQAQYDWL I DCRGYGAKTAWNQS PXXTSTLRG I RGEVARVYTPE I TLNRP VRLLHPRYP 

MMMMMMMMMMMM M M M M M M M M M M M M M M M M I 

orf 126 - 1 GLQAQYDWL I DCRGYGAKTAWNQS PEHTSTLRG I RGEVARVYTPE I TLNRP VRLLHPRYP 

190 200 210 220 230 240 



250 260 270 280 290 300 

orf 126a . pep LY I APKENXVFVIGATQ I ESESQAPASVRSGLELLSALYAVHPAFGEAD I LEIATGLRPT 

MMMM 1 1 1 i I ! 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 II 1 1 1 h 1 1 1 M 1 1 1 1 1 M 1 1 1 1 1 1 1 

orf 126 - 1 LY I APKENHVFVIGATQ I ESESQAPASVRSGLELLSALYAIHPAFGEAD I LEIATGLRPT 

250 260 270 280 290 300 



310 320 330 340 350 360 

orf 126a . pep LNHHNPEIRYNRARRLIEINGLFRHGFMISPAVTAAAVRLAVALFDGKXAPERDEESGLA 

1 1 1 1 1 1 1 1 II 1 1 1 1 1 1 1 1 1 1 1 1 1 1 ! 1 1 1 1 1 1 1 1 1 II h 1 1 M 1 1 M 1 1 1 1 1 1 h 1 1 1 1 1 

or f 1 2 6 - 1 LNHHNPE I RYNRARRL I E INGLFRHGFM I S PAVTAAAARLAVALFDGKDAPERDKESGLA 

310 320 330 340 350 360 



orf 126a. pep YIRRQDX 

MMMI 

orfl26-l YIRRQDX 

Homology with a predicted ORF from N. gonorrhoeae 

ORF126 (SEQ ID NO: 810) shows 90% identity over a 180 aa overlap with a predicted ORF 
(ORF126ng) (SEQ ID NO: 816) from N. gonorrhoeae: 

orf 126 .pep MTRIAILGGGLSGRLTALQLAEQGYQIALFDKSCRRGEHAAAYVAAAMLAPAAXTVEATP 60 

Illlhlllllllllllllllllllll lllh hllllllllllllMIII :|llll 

orf 126ng MTRIAVLGGGLSGRLTALQLAEQGYQIELFDKGTRQGEHAAAYVAAAMLAPAAEAVEATP 60 

orf 126 .pep EWRLGRQSIPLWRGIRCRLNTHTMMQENGSLIVWHGQDKPLSSEFVRHLKRGGXTDDEI 120 

I :| I II Ml I I I M I II ! I I I I I I M I I I I I I I I I I I I I I I II I I I I I I Mill 
orf 126ng EVIRLGRQSIPLWRGIRCRLNTLTMMQENGSLIVWHGQDKPLSSEFVRHLKRGGVADDEI 120 

orf 126 . pep VRWRADDIAEREPQLGGRFXDGIYLPTEXQLDGRQLXSALADALDELNVPCHWEHECVPE 180 

I I I I I M I I I I I I I I I I llllllll IIIMh I I I I I I I I I I I I I I I II ! hh 
orf 126ng VRWRADEIAEREPQLGGRFSDGIYLPTEGQLDGRQILSALADALDELNVPCHWEHECAPQ 180 

An ORF126ng nucleotide sequence [<SEQ ID 815>] (SEQ ID NO: 815) was predicted to encode a 
protein having amino acid sequence [<SEQ ID 816>] (SEQ ID NO: 816): 

1 MTRIAVLGGG LSGRLTALQL AEQGYQIELF DKGTRQGEHA AAYVAAAMLA 

51 PAAEAVEATP EVIRLGRQSI PLWRGIRCRL NTLTMMQENG SLIVWHGQDK 

101 PLSSEFVRHL KRGGVADDEI VRWRADEIAE REPQLGGRFS DGIYLPTEGQ 

151 LDGRQILSAL ADALDELNVP CHWEHECAPQ DLQAQYDWVI DCRGYGAKTA 

201 WNQSPEHTST LRGIRGEVRG FTRPKSRSTA PCACCTRAIR STSPRKKTTS 

251 SSSARPKSKA KAKPPPAYVP GWNSYPRSMP STPPSAKPTS SKWRPGLRPT 

301 LNHHNPEIRY SRERRLIEIN GLFRHGFMIS PAVTAAAVRL AVALFDGKDA 

351 PERDEESGLA YIGRQD* 

Further work revealed the following gonococcal DNA sequence [<SEQ ID 817>] (SEQ ID NO: 
817): 

1 ATGACCCGTA TCGCCGTCCT CGGAGGCGGC CTTTCCGGAA GGCTGACCGC 

51 ATTGCAGCTT GCAGAACAAG GTTATCAGAT TGAACTTTTC GACAAGGGCA 

101 CCCGCCAAGG CGAACACGCC GCCGCCTATG TTGCCGCCGC GATGCTCGCG 

151 CCTGCGGCGG AAGCGGTCGA GGCAACGCCC GAAGTCATCA GGCTGGGCAG 

201 GCAGAGCATT CCGCTTTGGC GCGGCATCCG ATGCCGTCTG AACACGCTCA 

251 CGATGATGCA GGAAAACGGC AGCCTGATTG TGTGGCACGG GCAGGACAAG 

301 CCATTATCCA GCGAGTTCGT CCGCCATCTC AAACGCGGCG GCGTAGCGGA 

351 TGACGAAATC GTCCGTTGGC GCGCCGATGA AATCGCCGAA CGCGAACCGC 

401 AACTCGGCGG ACGTTTTTCA GACGGCATCT ACCTGCCGAC CGAAGGCCAG 

451 CTCGACGGGC GGCAAATATT GTCTGCACTT GCCGACGCTT TGGACGAACT 

501 GAACGTCCCT TGCCATTGGG AACACGAATG CGCCCCCCAA GACCTGCAAG 

551 CCCAATACGA CTGGGTAATC GACTGCCGGG GCTACGGCGC GAAAACCGCG 

601 TGGAACCAAT CCCCCGAGCA CACCAGCACC TTGCGCGGCA TACGCGGCGA 

651 AGTGGCGCGG GTTTACACGC CCGAAATCAC GCTCAACCGC CCCGTGCGCC 

701 TGCTGCACCC GCGCTATCCG CTCTACATCG CCCCGAAAGA AAACCACGTC 

751 TTCGTCATCG GCGCGACCCA AATCGAAAGC GAAAGCCAAG CCCCCGCCAG 

801 CGTACGTTCC GGGCTGGAAC TCTTATCCGC GCTCTATGCC GTCCACCCCG 

851 CCTTCGGCGA AGCCGACATC CTCGAAATCG CCGCCGGCCT GCGCCCCACG 

901 CTCAACCACC ACAACCCCGA AATCCGCTAC AGCCGCGAAC GCCGCCTCAT 

951 CGAAATCAAC GGCCTTTTCC GGCACGGCTT TATGATTTCC CCCGCCGTAA 

1001 CCGCCGCCGC CGTCAGATTG GCAGTGGCAC TGTTTGACGG AAAAGACGCG 

1051 CCCGAACGTG ATGAAGAAAG CGGTTTGGCG TATATCGGAA GACAAGATTA 

1101 A 

This corresponds to the amino acid sequence [<SEQ ID 818; ORF126ng-1>] (SEQ ID NO: 818; 
ORF126ng-1): 

1 MTRIAVLGGG LSGRLTALQL AEQGYQIELF DKGTRQGEHA AAYVAAAMLA 
51 PAAEAVEATP EVIRLGRQSI PLWRGIRCRL NTLTMMQENG SLIVWHGQDK 
101 PLSSEFVRHL KRGGVADDEI VRWRADEIAE REPQLGGRFS DGIYLPTEGQ 
151 LDGRQILSAL ADALDELNVP CHWEHECAPQ DLQAQYDWVI DCRGYGAKTA 
201 WNQSPEHTST LRGIRGEVAR VYTPEITLNR PVRLLHPRYP LYIAPKENHV 
251 FVIGATQIES ESQAPASVRS GLELLSALYA VHPAFGEADI LEIAAGLRPT 
301 LNHHNPEIRY SRERRLIEIN GLFRHGFMIS PAVTAAAVRL AVALFDGKDA 
351 PERDEESGLA YIGRQD* 

ORF126ng-1 (SEQ ID NO: 818) and ORF126-1 (SEQ ID NO: 812) show 95.1% identity in 366 aa 
overlap: 

10 20 30 40 50 60 

orf 126-1 .pep MTRIAILGGGLSGRLTALQLAEQGYQIALFDKGCRRGEHAAAYVAAAMLAPAAEAVEATP 

i 1 1 IM 1 1 1 1 M II 1 1 1 1 1 1 1 1 M II Mill h I 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 i 1 1 

orf 126ng-1 MTRIAVLGGGLSGRLTALQLAEQGYQIELFDKGTRQGEHAAAYVAAAMLAPAAEAVEATP 

10 20 30 40 50 60 

70 80 90 100 110 120 

orf 126 - 1 . pep EWRLGRQSIPLWRGIRCRLNTHTMMQENGSLIVWHGQDKPLSSEFVRHLKRGGVADDEI . 

:||||lll lllllllllll I I I I I I I I I I I I I M I I I I I I I I I I I I I I I I I I M I 
orf 126ng-1 EVIRLGRQSIPLWRGIRCRLNTLTMMQENGSLIVWHGQDKPLSSEFVRHLKRGGVADDEI 

70 80 90 100 110 120 

130 140 150 160 170 180 

orf 126-1 .pep VRWRADDIAEREPQLGGRFSDGIYLPTEGQLDGRQILSALADALDELNVPCHWEHECVPE 

1 1 1 1 1 1: 1 1 1 : 1 1 1 1 M I II 1 1 ! 1 1 1 1 1 i 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 II I hh 

orf 126ng-1 VRWRADEIAEREPQLGGRFSDGIYLPTEGQLDGRQILSALADALDELNVPCHWEHECAPQ 

130 140 150 160 170 180 






190 200 210 220 230 240 

orf 126-1 .pep GLQAQ YDWL I DCRGYGAKTAWNQSPEHTSTLRG I RGEVAR VYTPEITLNR PVRLLHPRYP 
IIIIM llllllllllllllllllllllll IIIIIMMIIIIIII MINI 
orf 126ng-l DLQAQYDWVIDCRGYGAKTAWNQSPEHTSTLRGIRGEVARVYTPEITLNRPVRLLHPRYP 

190 200 210 220 230 240 






250 260 270 280 290 300 

orf 126-1 .pep LYIAPKENHVFVIGATQIESESQAPASVRSGLELLSALYAIHPAFGEADILEIATGLRPT 

1 1 1 II ! I M I 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 M 1 1 h 1 1 1 1 1 1 1 1 1 1 1 M: 1 1 1 1 1 

orf 126ng-l LYIAPKENHVFVIGATQ I ESESQAPASVRSGLELLSALYAVHPAFGEADI LEIAAGLRPT 

250 260 270 280 290 300 






310 320 330 340 350 360 

orf 126-1 .pep LNHHNPEIRYNRARRLIEINGLFRHGFMISPAVTAAAARLAVALFDGKDAPERDKESGLA 

IIIIIIIIIM I I I I I I I I II I I I I I I I I I I I 'hi li I I I I I I I I I I I I I: I I M I 
orf 126ng-l LNHHNPE I RYSRERRL I E INGLFRHGFMI SPAVTAAAVRLAVALFDGKDAPERDEESGLA 

310 320 330 . 340 350 360 






orf 126 - 1 . pep YIRRQDX 
II I I I I 

orf 126ng-l YIGRQDX 
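
As an illustration of how an identity figure of this kind can be computed, the following minimal Python sketch counts matching positions between two pre-aligned sequences of equal length. The function name `percent_identity` is hypothetical, and the calculation (identical columns divided by the number of aligned columns, with gap columns counted in the denominator) is an assumption; the alignment program actually used may define the denominator differently. The sample input is the final six residues of ORF126-1 and ORF126ng-1 from the alignment above.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Return percent identity over the aligned columns of two sequences."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    if not aligned_a:
        raise ValueError("alignment is empty")
    # A column counts as identical only if both residues match and neither is a gap.
    matches = sum(1 for a, b in zip(aligned_a, aligned_b) if a == b and a != "-")
    return 100.0 * matches / len(aligned_a)


# C-terminal residues 361-366 of ORF126-1 and ORF126ng-1 (stop marker omitted).
tail_identity = percent_identity("YIRRQD", "YIGRQD")
print(f"{tail_identity:.1f}% identity over 6 aa")
```

Applied to the full 366-residue alignment, the same count would be taken over every column rather than over this short excerpt.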



Furthermore, ORF126ng-1 (SEQ ID NO: 818) shows homology to a putative Rhizobium oxidase 
flavoprotein (SEQ ID NO: 1163): 

gi|2627327 (AF004408) putative amino acid oxidase flavoprotein [Rhizobium etli] 
Length = 327 
Score = 169 bits (423), Expect = 3e-41 

Identities = 112/329 (34%), Positives = 163/329 (49%), Gaps = 25/329 (7%) 

Query: 3 RIAVLGGGLSGRLTALQLAEQGYQIELFDKGTRQGEHXXXXXXXXXXXXXXXXXXXXXXX 62 

RI V G G++G A QL G+++ L ++ G 
Sbjct : 2 RILVNGAGVAGLTVAWQLYRHGFRVTLAERAGTVGA-GASGFAGGMLAPWCERESAEEPV 60 



Query: 63 IRLGRQSIPLWRGIRCRLNTLTMMQENGSLIVWHGQDKPLSSEFVRHLKRGGVADDEIVR 122 

+ LGR + W + G+L+V G+D F R G DE+ 

Sbjct: 61 LTLGRLAADWWEAA LPGHVHRRGTLWAGGRDTGELDRFSRRTS - GWEWLDEVA- 113 

Query: 123 WRADEIAEREPQLGGRFSDGIYLPTEGQLDGRQILSALADALDELNVPCHWEHECAPQDL 182 

I A EP L GRF ++ E LD RQ L+ALA L++ + + 
Sbjct: 114 IAALEPDLAGRFRRALFFRQEAHLDPRQALAALAAGLEDARMRLTLG WGES 165 

Query: 183 QAQYDWVIDCRGYGAKTAWNQSPEHTSTLRGIRGEVARVYTPEITLNRPVRLLHPRYPLY 242 

+D V+DC G LRG+RGE+ V T E++L+RPVRLLHPR+P+Y 

Sbjct: 166 DVDHDRWDCTGAA QIGRLPGLRGVRGEMLCVETTEVSLSRPVRLLHPRHPIY 218 

Query: 243 IAPKENHVFVIGATQIESESQAPASVRSGLELLSALYAVHPAFGEADILEIAAGLRPTLN 302 

I P+ + + F++GAT IES+ P + RS +ELL+A YA+HPAFGEA + E AG+RP 
Sbjct: 219 IVPRDKNRFMVGATMIESDDGGPITARSLMELLNAAYAMHPAFGEARVTETGAGVRPAYP 278 

Query: 303 HHNPEIRYSRERRLIEINGLFRHGFMISP 331 

+ P R ++E R + +NGL+RHGF+++P 
Sbjct: 279 DNLP- -RVTQEGRTLHVNGLYRHGFLLAP 305 
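
The percentages in the summary line of this alignment can be checked against the raw counts. The short Python sketch below assumes that each percentage is the integer part of count/length (truncation rather than rounding); that assumption is consistent with the three reported values, but it is an inference from the numbers, not a documented property of the program that produced them. The helper name `truncated_percent` is hypothetical.

```python
def truncated_percent(count: int, length: int) -> int:
    """Integer percentage of count/length, discarding the fractional part."""
    if length <= 0:
        raise ValueError("length must be positive")
    return (100 * count) // length


alignment_length = 329  # aligned columns reported in the summary line
summary = {
    "Identities": truncated_percent(112, alignment_length),
    "Positives": truncated_percent(163, alignment_length),
    "Gaps": truncated_percent(25, alignment_length),
}
print(summary)
```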

This analysis suggests that the proteins from N. meningitidis and N. gonorrhoeae, and their epitopes, 
could be useful antigens for vaccines or diagnostics, or for raising antibodies. 

Example 97 

The following DNA sequence, believed to be complete, was identified in N. meningitidis [<SEQ ID 
819>] (SEQ ID NO: 819): 

1 ATGACTGATA ATCGGGGGTT TACGCTGGTT GAATTAATAT CAGTGGTCTT 

51 GATATTGTCT GTACTTGCTT TAATTGTTTA TCCGAGCTAT CGCAATTATG 

101 TTGAGAAAGC AAAGATAAAT GCAGTGCGGG CAGCCTTGTT AGAAAATGCA 

151 CATTTTATGG AAAAGTTTTA TCTGCAGAAT GGGAGGTTTA AACAAACATC 

201 TACCAAGTGG CCAAGTTTGC CGATTAAAGA GGCAGAAGGC TTTTGTATCC 

251 GTTTGAATGG AATCGtCGCG CGGG..GCTT TAGACAGTAA ATTCATGTTG 

301 AAGGCGGTAG CCATAGATAA AGATAAAAAT CCTTTTATTA TTAAGATGAA 

351 TGAAAATCTA GTAACCTTTA ^TTTGCAAGA AGTCCGCCAG TTCGTGTAGT 

401 GACGGGCTGG ATTATTTTAA AGGAAATGAT AAGGACTGCA AGTTACTTAA 

451 GTAG 



This corresponds to the amino acid sequence [<SEQ ID 820; ORF127>] (SEQ ID NO: 820; 
ORF127): 

1 MTDNRGFTLV ELISVVLILS VLALIVYPSY RNYVEKAKIN AVRAALLENA 
51 HFMEKFYLQN GRFKQTSTKW PSLPIKEAEG FCIRLNGIVA RXALDSKFML 
101 KAVAIDKDKN PFIIKMNENL VTFICKKSAS SCSDGLDYFK GNDKDCKLLK 
151 * 

Further work revealed the following DNA sequence [<SEQ ID 821>] (SEQ ID NO: 821): 

1 ATGACTGATA ATCGGGGGTT TACGCTGGTT GAATTAATAT CAGTGGTCTT 

51 GATATTGTCT GTACTTGCTT TAATTGTTTA TCCGAGCTAT CGCAATTATG 

101 TTGAGAAAGC AAAGATAAAT GCAGTGCGGG CAGCCTTGTT AGAAAATGCA 

151 CATTTTATGG AAAAGTTTTA TCTGCAGAAT GGGAGGTTTA AACAAACATC 

201 TACCAAGTGG CCAAGTTTGC CGATTAAAGA GGCAGAAGGC TTTTGTATCC 

251 GTTTGAATGG AATCGCGCGC GGGGCTTTAG ACAGTAAATT CATGTTGAAG 

301 GCGGTAGCCA TAGATAAAGA TAAAAATCCT TTTATTATTA AGATGAATGA 

351 AAATCTAGTA ACCTTTATTT GCAAGAAGTC CGCCAGTTCG TGTAGTGACG 

401 GGCTGGATTA TTTTAAAGGA AATGATAAGG ACTGCAAGTT ACTTAAGTAG 

This corresponds to the amino acid sequence [<SEQ ID 822; ORF127-1>] (SEQ ID NO: 822; 
ORF127-1): 

1 MTDNRGFTLV ELISVVLILS VLALIVYPSY RNYVEKAKIN AVRAALLENA 
51 HFMEKFYLQN GRFKQTSTKW PSLPIKEAEG FCIRLNGIAR GALDSKFMLK 
101 AVAIDKDKNP FIIKMNENLV TFICKKSASS CSDGLDYFKG NDKDCKLLK* 
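
The correspondence between a nucleotide listing and its amino acid listing can be checked by translating the DNA in reading frame 1. The following is a minimal Python sketch, assuming the standard genetic code; the function name `translate` and the choice to report unrecognised codons as `X` are illustrative assumptions, not part of the original analysis. The input is the first 48 nucleotides of SEQ ID NO: 821 as printed above, and the printed result can be compared with the start of the protein listing.

```python
from itertools import product

BASES = "TCAG"
# Amino acids for the 64 codons in TCAG order (standard genetic code, '*' = stop).
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(dna: str) -> str:
    """Translate DNA in reading frame 1; unrecognised codons become 'X'."""
    seq = "".join(dna.split()).upper()  # drop the spaces between 10-base blocks
    usable = len(seq) - len(seq) % 3    # ignore a trailing partial codon
    codons = (seq[i:i + 3] for i in range(0, usable, 3))
    return "".join(CODON_TABLE.get(codon, "X") for codon in codons)


# First 48 nucleotides of SEQ ID NO: 821 as printed above.
orf127_1_start = translate("ATGACTGATA ATCGGGGGTT TACGCTGGTT GAATTAATAT CAGTGGTC")
print(orf127_1_start)
```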

Computer analysis of this amino acid sequence gave the following results: 
Homology with a predicted ORF from N. meningitidis (strain A) 

ORF127 (SEQ ID NO: 820) shows 98.0% identity over a 150aa overlap with an ORF (ORF127a) 
(SEQ ID NO: 824) from strain A of N. meningitidis: 

10 20 30 40 50 60 

orf 127 . pep MTDNRGFTLVELISWLILSVLALIVYPSYRNYVEKAKINAVRAALLENAHFMEKFYLQN 

IMIMMMMI IMIMMMMMMI MIMIN MMMMIIMIM M 

orf 127a MTDNRGFTLVELISWLILSVLALIVYPSYRNYVEKAKINTVRAALLENAHFMEKFYLQN 

10 20 30 40 50 60 

70 80 90 100 110 120 

orf 127. pep GRFKQTSTKWPSLPIKEAEGFCIRLNGIVARXALDSKFMLKAVAIDKDKNPFIIKMNENL 

I MM III II II II I MM II Mill II II MINIMUM INI MINI III 

orf 12 7a GRFKQTSTKWPSLPIKEAEGFCIRLNGI-ARGALDSKFMLKAVAIDKDKNPFIIKMNENL 

70 80 90 100 110 

130 140 150 

orf 127 .pep VTFI CKKSASSCSDGLDYFKGNDKDCKLLKX 

I I I I M I I I I I I I I I I ! I I I I M I I I I I I 
orf 127a VTFI CKKSASSCSDGLDYFKGNDKDCKLLKX 

120 130 140 150 

The complete length ORF127a nucleotide sequence [<SEQ ID 823>] (SEQ ID NO: 823) is: 

1 ATGACTGATA ATCGGGGGTT TACGCTGGTT GAATTAATAT CAGTGGTCTT 

51 GATATTGTCT GTACTTGCTT TAATTGTTTA TCCGAGCTAT CGCAATTATG 

101 TTGAGAAAGC AAAGATAAAT ACAGTGCGGG CAGCCTTGTT AGAAAATGCA 

151 CATTTTATGG AAAAGTTTTA TCTGCAGAAT GGGAGATTTA AACAAACATC 

201 TACCAAATGG CCAAGTTTGC CGATTAAAGA GGCAGAAGGC TTTTGTATCC 

251 GTTTGAATGG AATCGCGCGC GGGGCCTTAG ACAGTAAATT CATGTTGAAG 

301 GCGGTAGCCA TAGATAAAGA TAAAAATCCT TTTATTATTA AGATGAATGA 

351 AAATCTAGTA ACCTTTATTT GCAAGAAGTC CGCCAGTTCG TGTAGTGACG 

401 GGCTGGATTA TTTTAAAGGA AATGATAAGG ACTGCAAGTT ACTTAAGTAG 

This encodes a protein having amino acid sequence [<SEQ ID 824>] (SEQ ID NO: 824): 

1 MTDNRGFTLV ELISVVLILS VLALIVYPSY RNYVEKAKIN TVRAALLENA 

51 HFMEKFYLQN GRFKQTSTKW PSLPIKEAEG FCIRLNGIAR GALDSKFMLK 

101 AVAIDKDKNP FIIKMNENLV TFICKKSASS CSDGLDYFKG NDKDCKLLK* 

ORF127a (SEQ ID NO: 824) and ORF127-1 (SEQ ID NO: 822) show 99.3% identity in 149 aa 
overlap: 



10 20 30 40 50 60 

orf 12 7a. pep MTDNRGFTLVELISWLILSVLALIVYPSYRNYVEKAKINTVRAALLENAHFMEKFYLQN 

II II III Mill 1 1 II I III II MM MINIM I MM hill II Illlllllllllll 

orf 12 7-1 MTDNRGFTLVELISWLILSVLALIVYPSYRNYVEKAKINAVRAALLENAHFMEKFYLQN 

• 10 20 30 40 50 60 

70 80 90 100 110 120 

orf 12 7a. pep GRFKQTSTKWPS LP I KEAEGFC I RLNGI ARGALDSKFMLKAVAIDKDKNPF I I KMNENLV 

I M II 1 1 1 1 1 II 1 1 1 1 M I II M 1 1 1 II I M II II II M 1 1 1 1 1 1 1 1 1 1 II 1 1 1 1 1 1 1 1 1 

orf 12 7-1 GRFKQTSTKWPSLP I KEAEGFC I RLNGI ARGALDSKFMLKAVAIDKDKNPF I I KMNENLV 

70 80 90 100 110 120 

130 140 150 

orf 127a .pep TFICKKSASSCSDGLDYFKGNDKDCKLLKX 

1 1 1 1 E 1 1 1 1 1 1 1 1 1 1 1 E 1 1 1 1 1 1 1 i 1 1 1 i I 

orfl27-l TFICKKSASSCSDGLDYFKGNDKDCKLLKX 

130 140 150 ■ 

Homology with a predicted ORF from N. gonorrhoeae 

ORF127 (SEQ ID NO: 820) shows 97.3% identity over a 150 aa overlap with a predicted ORF 
(ORF127ng) (SEQ ID NO: 826) from N. gonorrhoeae: 

orf 127 .pep MTDNRGFTLVEL I S WL ILS VLAL I VYPS YRNYVEKAKINAVRAALLENAHFMEKFYLQN 60 

1 1 1 1 1 1 1 1 1 1 1 1 1 1 i 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 h 1 1 1 1 1 1 1 1 1 1 1 1 1 1 

orf 12 7ng MTDNRGFTLVELISWLILSVLALIVYPSYRNYVEKAKINAVRAAFLENAHFMEKFYLQN 60 

orf 127 .pep GRFKQTS TKWPSLP I KEAEGFC I RLNG I VARXALDS KFMLKAVA I DKDKNP F 1 1 KMNENL 120 

MIMMMMIMMIMMMIMM II , 1 1 1 1 1 1 1 1 1 i 1 1 1 1 1 1 1 1 1 1 1 1 1 M II 

orf 12 7ng GRFKQTSTKWPS LP I KEAEGFC I RLNGI - ARGALDS KFMLKAVA I DKDKNP F 1 1 KMNENL 119 

orf 127 .pep VTFICKKSASSCSDGLDYFKGNDKDCKLLK 150 

Illlllllllllll I 1 I I I I i I I I I I 1 I I 
orf 127ng VTFICKKSASSCSDRLDYFKGNDKDCKLLK 149 

The complete length ORF127ng nucleotide sequence [<SEQ ID 825>] (SEQ ID NO: 825) is: 



1 ATGACTGATA ATCGGGGGTT TACACTGGTT GAATTAATAT CAGTGGTCTT 

51 GATATTGTCT GTACTTGCTT TAATTGTTTA TCCGAGCTAT CGCAATTATG 

101 TTGAGAAAGC AAAGATAAAT GCAGTGCGGG CAGCCTTGTT AGAAAATGCA 

151 CATTTTATGG AAAAGTTTTA TCTGCAGAAT GGGAGATTTA AACAAACATC 

201 TACCAAATGG CCAAGTTTGC CGATTAAAGA GGCAGAAGGC TTTTGTATCC 

251 GTTTGAATGG AATCGCGCGC GGGGCTTTAG ACAGTAAATT CATGTTGAAG 

301 GCGGTAGCCA TAGATAAAGA TAAAAATCCT TTTATTATTA AGATGAATGA 

351 AAATCTAGTA ACCTTTATTT GCAAGAAGTC CGCCAGTTCG TGTAGTGACG 

401 GGCTGGATTA TTTTAAAGGA AATGATAAGG ACTGCAAGTT ACTTAAGTAG 

This encodes a protein having amino acid sequence [<SEQ ID 826>] (SEQ ID NO: 826): 

1 MTDNRGFTLV ELISVVLILS VLALIVYPSY RNYVEKAKIN AVRAAFLENA 
51 HFMEKFYLQN GRFKQTSTKW PSLPIKEAEG FCIRLNGIAR GALDSKFMLK 
101 AVAIDKDKNP FIIKMNENLV TFICKKSASS CSDRLDYFKG NDKDCKLLK* 

ORF127ng (SEQ ID NO: 826) and ORF127-1 (SEQ ID NO: 822) show 100.0% identity in 149 aa 
overlap: 

                      10        20        30        40        50        60
orf127-1.pep  MTDNRGFTLVELISVVLILSVLALIVYPSYRNYVEKAKINAVRAALLENAHFMEKFYLQN
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
orf127ng-1    MTDNRGFTLVELISVVLILSVLALIVYPSYRNYVEKAKINAVRAALLENAHFMEKFYLQN
                      10        20        30        40        50        60

                      70        80        90       100       110       120
orf127-1.pep  GRFKQTSTKWPSLPIKEAEGFCIRLNGIARGALDSKFMLKAVAIDKDKNPFIIKMNENLV
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
orf127ng-1    GRFKQTSTKWPSLPIKEAEGFCIRLNGIARGALDSKFMLKAVAIDKDKNPFIIKMNENLV
                      70        80        90       100       110       120

                     130       140       150
orf127-1.pep  TFICKKSASSCSDGLDYFKGNDKDCKLLKX
              ||||||||||||||||||||||||||||||
orf127ng-1    TFICKKSASSCSDGLDYFKGNDKDCKLLKX
                     130       140       150

This analysis, including the fact that the predicted transmembrane domain is shared by the 
meningococcal and gonococcal proteins, suggests that the proteins from N. meningitidis and 
N. gonorrhoeae, and their epitopes, could be useful antigens for vaccines or diagnostics, or for 
raising antibodies. 

Example 98 



The following partial DNA sequence was identified in N. meningitidis [<SEQ ID 827>] (SEQ ID 
NO: 827): 

1 ..GTGTCGCTGG CTTCGGTGAT TGCCTCTCAA ATCTTCCTTT ACGAAGATTT 

51 CAACCAAATG CGGAAAACCC GTGGAGCTAT CTGCGGTTTT CTTGTCCAAT 

101 ATTTATCTGG GGTTTCAGCA GGGGTATTTC GATTTGAGTG CCGACGAGAA 

151 CCCCGTACTG CATATCTGGT CTTTGGCAGT AGAGGAACAG TATTACCTCC 

201 TGTATCCCCT TTTGCTGATA TTTTGCTGCA AAAAAACCAA ATCGCTACGG 

251 GTGCTGCGTA ACATCAGCAT CATCCTGTTT TTGATTTTGA CTGCCTCATC 

301 GTTTTTGCCA AGCGGGTTTT ATACCGACAT CCTCAACCAA CCCAATACTT 

351 ATTACCTTTC GACACTGAGG TTTCCCGAGC TGTTGGCAGG TTCGCTGCTG 

401 GCGGTTTACG GGCAAACGCA AAACGGCAGA CGGCAAACAG CAAATGGAAA 

451 ACGGCAGTTG CTTTCATCAC TCTGCTTCGG CGCATTGCTT GCCTGCCTGT 

501 TCGTGATTGA CAAACACAAT CCGTTTATCC CGGGAATGAC CCTGCTCCTT 

551 CCCTGCCTGC TGACGGCACT GCTTATCCGG AGTATGCAAT ACGGGACACT 

601 TCCGACCCGC ATCCTGTCGG CAAGCCCCAT CGTATTTGTC GGCAAAATCT 

651 CTTATTCCCT ATACCTGTAC CATTGGATTT TTATTGCTTT CGCTCCGCTC 

701 ATTAGAGGCG GGAAACAGCT CGGACTGCCT GCCG.. 



This corresponds to the amino acid sequence [<SEQ ID 828; ORF128>] (SEQ ID NO: 828; 
ORF128): 



1 . .VSLASVIASQ IFLYEDFNQM RKTVELSAVF LSNIYLGFQQ GYFDLSADEN 

51 PVLHIWSLAV EEQYYLLYPL LLIFCCKKTK SLRVLRNISI ILFLILTASS 

101 FLPSGFYTDI LNQPNTYYLS TLRFPELLAG SLLAVYGQTQ NGRRQTANGK 

151 RQLLSSLCFG ALLACLFVID KHNPFIPGMT LLLPCLLTAL LIRSMQYGTL 

201 PTRILSASPI VFVGKISYSL YLYHWIFIAF APLIRGGKQL GLPA. . 

Further work revealed the complete nucleotide sequence [<SEQ ID 829>] (SEQ ID NO: 829): 



1 ATGCAAGCTG TCCGATACAG ACCGGAAATT GACGGATTGC GGGCCGTCGC 

51 CGTGCTATCC GTCATGATTT TCCACCTGAA TAACCGCTGG CTGCCCGGAG 

101 GATTCCTGGG GGTGGACATT TTCTTTGTCA TCTCAGGATT CCTCATTACC 

151 GGCATCATTC TTTCTGAAAT ACAGAACGGT TCTTTTTCTT TCCGGGATTT 

201 TTATACCCGC AGGATTAAGC GGATTTATCC TGCCTTTATT GCGGCCGTGT 

251 CGCTGGCTTC GGTGATTGCC TCTCAAATCT TCCTTTACGA AGATTTCAAC 

301 CAAATGCGGA AAACCGTGGA GCTTTCTGCG GTTTTCTTGT CCAATATTTA 

351 TCTGGGGTTT CAGCAGGGGT ATTTCGATTT GAGTGCCGAC GAGAACCCCG 

401 TACTGCATAT CTGGTCTTTG GCAGTAGAGG AACAGTATTA CCTCCTGTAT 

451 CCCCTTTTGC TGATATTTTG CTGCAAAAAA ACCAAATCGC TACGGGTGCT 

501 GCGTAACATC AGCATCATCC TGTTTTTGAT TTTGACTGCC TCATCGTTTT 

551 TGCCAAGCGG GTTTTATACC GACATCCTCA ACCAACCCAA TACTTATTAC 

601 CTTTCGACAC TGAGGTTTCC CGAGCTGTTG GCAGGTTCGC TGCTGGCGGT 

651 TTACGGGCAA ACGCAAAACG GCAGACGGCA AACAGCAAAT GGAAAACGGC 

701 AGTTGCTTTC ATCACTCTGC TTCGGCGCAT TGCTTGCCTG CCTGTTCGTG 

751 ATTGACAAAC ACAATCCGTT TATCCCGGGA ATGACCCTGC TCCTTCCCTG 

801 CCTGCTGACG GCACTGCTTA TCCGGAGTAT GCAATACGGG ACACTTCCGA 

851 CCCGCATCCT GTCGGCAAGC CCCATCGTAT TTGTCGGCAA AATCTCTTAT 

901 TCCCTATACC TGTACCATTG GATTTTTATT GCTTTCGCCC ATTACATTAC 

951 AGGCGACAAA CAGCTCGGAC TGCCTGCCGT ATCGGCGGTT GCCGCGTTGA 

1001 CGGCCGGATT TTCCCTGTTG AGTTATTATT TGATTGAACA GCCGCTTAGA 

1051 AAACGGAAGA TGACCTTCAA AAAGGCATTT TTCTGCCTCT ATCTCGCCCC 

1101 GTCCCTGATA CTTGTCGGTT ACAACCTGTA CGCAAGGGGG ATATTGAAAC 

1151 AGGAACACCT CCGCCCGTTG CCCGGCGCGC CCCTTGCTGC GGAAAATCAT 

1201 TTTCCGGAAA CCGTCCTGAC CCTCGGCGAC TCGCACGCCG GACACCTGAG 

1251 GGGGTTTCTG GATTATGTCG GCAGCCGGGA AGGGTGGAAA GCCAAAATCC 

1301 TGTCCCTCGA TTCGGAGTGT TTGGTTTGGG TAGATGAGAA GCTGGCAGAC 

1351 AACCCGTTAT GTCGAAAATA CCGGGATGAA GTTGAAAAAG CCGAAGCCGT 

1401 TTTCATTGCC CAATTCTATG ATTTGAGGAT GGGCGGCCAG CCTGTGCCGA 

1451 GATTTGAAGC GCAATCCTTC CTAATACCCG GGTTCCCAGC CCGATTCAGG 

1501 GAAACCGTCA AAAGGATAGC GGCCGTCAAA CCCGTCTATG TTTTTGCAAA 

1551 CAACACATCA ATCAGCCGTT CGCCCCTGAG GGAGGAAAAA TTGAAAAGAT 

1601 TTGCCGCAAA CCAATATCTC CGCCCCATTC AGGCTATGGG CGACATCGGC 

1651 AAGAGCAATC AGGCGGTCTT TGATTTGATT AAAGATATTC CCAATGTGCA 

1701 TTGGGTGGAC GCACAAAAAT ACCTGCCCAA AAACACGGTC GAAATATACG 

1751 GCCGCTATCT TTACGGCGAC CAAGACCACC TGACCTATTT CGGTTCTTAT 

1801 TATATGGGGC GGGAATTCCA CAAACACGAA CGCCTGCTTA AATCTTCCCA 

1851 CGGCGGCGCA TTGCAGTAG 

This corresponds to the amino acid sequence [<SEQ ID 830; ORF128-1>] (SEQ ID NO: 830; 
ORF128-1): 

1 MQAVRYRPEI DGLRAVAVLS VMIFHLNNRW LPGGFLGVDI FFVISGFLIT 

51 GIILSEIQNG SFSFRDFYTR RIKRIYPAFI AAVSLASVIA SQIFLYEDFN 

101 QMRKTVELSA VFLSNIYLGF QQGYFDLSAD ENPVLHIWSL AVEEQYYLLY 

151 PLLLIFCCKK TKSLRVLRNI SIILFLILTA SSFLPSGFYT DILNQPNTYY 

201 LSTLRFPELL AGSLLAVYGQ TQNGRRQTAN GKRQLLSSLC FGALLACLFV 

251 IDKHNPFIPG MTLLLPCLLT ALLIRSMQYG TLPTRILSAS PIVFVGKISY 

301 SLYLYHWIFI AFAHYITGDK QLGLPAVSAV AALTAGFSLL SYYLIEQPLR 

351 KRKMTFKKAF FCLYLAPSLI LVGYNLYARG ILKQEHLRPL PGAPLAAENH 

401 FPETVLTLGD SHAGHLRGFL DYVGSREGWK AKILSLDSEC LVWVDEKLAD 

451 NPLCRKYRDE VEKAEAVFIA QFYDLRMGGQ PVPRFEAQSF LIPGFPARFR 

501 ETVKRIAAVK PVYVFANNTS ISRSPLREEK LKRFAANQYL RPIQAMGDIG 

551 KSNQAVFDLI KDIPNVHWVD AQKYLPKNTV EIYGRYLYGD QDHLTYFGSY 

601 YMGREFHKHE RLLKSSHGGA LQ* 

Computer analysis of this amino acid sequence gave the following results: 

Homology with hypothetical integral membrane protein HI0392 of H. influenzae (accession number 
U32723) (SEQ ID NO: 1164) 

ORF128 (SEQ ID NO: 828) and HI0392 (SEQ ID NO: 1164) show 52% aa identity in 180aa 
overlap: 

Orf128:   1 VSLASVIASQIFLYEDFNQMRKTVELSAVFLSNIYLGFQQGYFDLSADENPVLHIWSLAV 60
            ++L S IAS IF+Y DFN++RKT+EL+  FLSN YLG  QGYFDLSA+ENPVLHIWSLAV
HI0392:  46 MALVSFIASAIFIYNDFNKLRKTIELAIAFLSNFYLGLTQGYFDLSANENPVLHIWSLAV 105

Orf128:  61 EEQXXXXXXXXXIFCCKKTKSLRVLRNISIILFLILTASSFLPSGFYTDILNQPNTYYLS 120
            E Q         I   KK + ++VL  I++ILF IL A+SF+ + FY ++L+QPN YYLS
HI0392: 106 EGQYYLIFPLILILAYKKFREVKVLFIITLILFFILLATSFVSANFYKEVLHQPNIYYLS 165

Orf128: 121 TLRFPELLAGSLLAVYGQTQNGRRQTANGKRQLLSSLCFGALLACLFVIDKHNPFIPGMT 180
             LRFPELL GSLLA+Y    N + Q +     +L+ L    L +CLF+++ +  FIPG+T
HI0392: 166 NLRFPELLVGSLLAIYHNLSN-KVQLSKQVNNILAILSTLLLFSCLFLMNNNIAFIPGIT 224



Homology with a predicted ORF from N. meningitidis (strain A) 



ORF128 (SEQ ID NO: 828) shows 98.0% identity over a 244aa overlap with an ORF (ORF128a) 
(SEQ ID NO: 832) from strain A of N. meningitidis: 

10 20 30 

orf 128 . pep VSLASVI ASQI FLYEDFNQMRKTVELSAVF 

Ml I I I I I I I M I II I I I I I I I I I I I I I I I 
orf 128a I LSEIQNGSFSFRDFYTRRIKRIYPAFIAAVSLASVI ASQI FLYEDFNQMRKTVELSAVF 

60 70 80 90 100 110 

40- 50 60 70 80 90 

orf 128 .pep LSNIYLGFQQGYFDLSADENPVLHIWSLAVEEQYYLLYPLLLIFCCKKTKSLRVLRNISI 

1 1 1 1 1 1 1 1 1 1 1 1 II I Ml I i II 1 1 M 1 1 1 1 1 1 1 1 1 II M 1 1 1 1 1 II 1 1 1 1 1 1 1 M I II I 

orf 12 8a LSNIYLGFQQGYFDLSADENPVLHIWSLAVEEQYYLLYPLLLIFCCKKTKSLRVLRNISI 
120 130 140 150 160 170 

100 110 120 130 140 150 

orf 128 .pep ILFLILTASSFLPSGFYTDILNQPNTYYLSTLRFPELLAGSLLAVYGQTQNGRRQTANGK 

1 1 M 1 1 1 Ml 1 1 1 1 1 II I II 1 1 1 1 1 1 1 1 1 M I! 1 1 1 M I II I II I M I II 1 1 1 1 1 1 1 II 

orf 128a ILFLILTATSFLPSGFYTDILNQPNTYYLSTLRFPELLAGSLLAVYGQTQNGRRQTANGK 
180 190 200 210 220 230 

160 170 180 190 200 210 

orf 12 8 .pep RQLLSSLCFGALLACLFVIDKHNPFIPGMTLLLPCLLTALLIRSMQYGTLPTRILSASPI 

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 

orf 12 8a RQLLSSLCFGALLACLFVIDKHNPFIPGMTLLLPCLLTALLIRSMQYGTLPTRILSASPI 
240 250 260 270 280 290 

220 230 240 

orf 12 8 .pep VFVGKI SYS LYLYHW I F I AFAPL I RGGKQLGLP A 

M II 1 1 1 M I M 1 1 1 1 M 1 1 I I MM 

orf 128a VFVGKI SYSLYLYHWIFIAFAHYITGDKQLGLPAVSAVAALTAGFSLLSYYLIEQPLRKR 

300 310 320 330 340 350 

orf 12 8a KMTFKKAFFCLYLAPSLILVGYNLYARGILKQEHLRPLPGAPLAAENHFPETVLTLGDSH 
360 370 380 390 400 410 

The complete length ORF128a nucleotide sequence [<SEQ ID 831>] (SEQ ID NO: 831) is: 

1 ATGCAAGCTG TCCGATACAG ACCGGAAATT GACGGATTGC GGGCCGTCGC 

51 CGTGCTATCC GTCATGATTT TCCACCTGAA TAACCGCTGG CTGCCCGGAG 

101 GATTCCTGGG GGTGGACATT TTCTTTGTCA TCTCAGGATT CCTCATTACC 

151 GGCATCATTC TTTCTGAAAT ACAGAACGGT TCTTTTTCTT TCCGGGATTT 

201 TTATACCCGC AGGATTAAGC GGATTTATCC TGCTTTTATT GCGGCCGTGT 

251 CGCTGGCTTC GGTGATTGCC TCTCAAATCT TCCTTTACGA AGATTTCAAC 

301 CAAATGCGGA AAACCGTGGA GCTTTCTGCG GTTTTCTTGT CCAATATTTA 

351 TCTGGGGTTT CAGCAGGGGT ATTTCGATTT GAGTGCCGAC GAGAACCCCG 

401 TACTGCATAT CTGGTCTTTG GCAGTAGAGG AACAGTATTA CCTCCTGTAT 

451 CCTCTTTTGC TGATATTTTG CTGCAAAAAA ACAAAATCGC TACGGGTGCT 

501 GCGTAACATC AGCATCATCC TATTTCTGAT TTTGACTGCC ACATCGTTTT 

551 TGCCAAGCGG GTTTTATACC GATATTCTCA ACCAACCCAA TACTTATTAC 

601 CTTTCGACAC TGAGGTTTCC CGAGCTGTTG GCAGGTTCGC TGCTGGCGGT 

651 TTACGGGCAA ACGCAAAACG GCAGACGGCA AACAGCAAAT GGAAAACGGC 

701 AGTTGCTTTC ATCACTCTGC TTCGGCGCAT TGCTTGCCTG CCTGTTCGTG 

751 ATTGACAAAC ACAATCCGTT TATCCCGGGA ATGACCCTGC TCCTTCCCTG 

801 CCTGCTGACG GCACTGCTTA TCCGGAGTAT GCAATACGGG ACACTTCCGA 

851 CCCGCATCCT GTCGGCAAGC CCCATCGTAT TTGTCGGCAA AATCTCTTAT 

901 TCCCTATACC TGTACCATTG GATTTTTATT GCTTTCGCCC ATTACATTAC 

951 AGGCGACAAA CAGCTCGGAC TGCCTGCCGT ATCGGCGGTT GCCGCGTTGA 

1001 CGGCCGGATT TTCCCTGTTG AGTTATTATT TGATTGAACA GCCGCTTAGA 

1051 AAACGGAAGA TGACCTTCAA AAAGGCATTT TTCTGCCTCT ATCTCGCCCC 

1101 GTCCCTGATA CTTGTCGGTT ACAACCTGTA CGCAAGGGGG ATATTGAAAC 

1151 AGGAACACCT CCGCCCGTTG CCCGGCGCGC CCCTTGCTGC GGAAAATCAT 

1201 TTTCCGGAAA CCGTCCTGAC CCTCGGCGAC TCGCACGCCG GACACCTGCG 

1251 GGGGTTTCTG GATTATGTCG GCAGCCGGGA AGGGTGGAAA GCCAAAATCC 

1301 TGTCCCTCGA TTCGGAGTGT TTGGTTTGGG TAGATGAGAA GCTGGCAGAC 

1351 AACCCGTTAT GTCGAAAATA CCGGGATGAA GTTGAAAAAG CCGAAGCCGT 

1401 TTTCATTGCC CAATTCTATG ATTTGAGGAT GGGCGGCCAG CCCGTGCCGA 

1451 GATTTGAAGC GCAATCCTTC CTAATACCCG GGTTCCCAGC CCGATTCAGG 

1501 GAAACCGTCA AAAGGATAGC CGCCGTCAAA CCCGTCTATG TTTTTGCAAA 

1551 CAACACATCA ATCAGCCGTT CGCCCCTGAG GGAGGAAAAA TTGAAAAGAT 

1601 TTGCCGCAAA CCAATATCTC CGCCCCATTC AGGCTATGGG CGACATCGGC 

1651 AAGAGCAATC AGGCGGTCTT TGATTTGATT AAAGATATTC CCAATGTGCA 

1701 TTGGGTGGAC GCACAAAAAT ACCTGCCCAA AAACACGGTC GAAATATACG 

1751 GCCGCTATCT TTACGGCGAC CAAGACCACC TGACCTATTT CGGTTCTTAT 

1801 TATATGGGGC GGGAATTTCA CAAACACGAA CGCCTGCTTA AATCTTCTCG 

1851 CGACGGCGCA TTGCAGTAG 



This encodes a protein having amino acid sequence [<SEQ ID 832>] (SEQ ID NO: 832): 

1 MQAVRYRPEI DGLRAVAVLS VMIFHLNNRW LPGGFLGVDI FFVISGFLIT 

51 GIILSEIQNG SFSFRDFYTR RIKRIYPAFI AAVSLASVIA SQIFLYEDFN 

101 QMRKTVELSA VFLSNIYLGF QQGYFDLSAD ENPVLHIWSL AVEEQYYLLY 

151 PLLLIFCCKK TKSLRVLRNI SIILFLILTA TSFLPSGFYT DILNQPNTYY 

201 LSTLRFPELL AGSLLAVYGQ TQNGRRQTAN GKRQLLSSLC FGALLACLFV 

251 IDKHNPFIPG MTLLLPCLLT ALLIRSMQYG TLPTRILSAS PIVFVGKISY 

301 SLYLYHWIFI AFAHYITGDK QLGLPAVSAV AALTAGFSLL SYYLIEQPLR 

351 KRKMTFKKAF FCLYLAPSLI LVGYNLYARG ILKQEHLRPL PGAPLAAENH 

401 FPETVLTLGD SHAGHLRGFL DYVGSREGWK AKILSLDSEC LVWVDEKLAD 

451 NPLCRKYRDE VEKAEAVFIA QFYDLRMGGQ PVPRFEAQSF LIPGFPARFR 

501 ETVKRIAAVK PVYVFANNTS ISRSPLREEK LKRFAANQYL RPIQAMGDIG 

551 KSNQAVFDLI KDIPNVHWVD AQKYLPKNTV EIYGRYLYGD QDHLTYFGSY 

601 YMGREFHKHE RLLKSSRDGA LQ* 

ORF128a (SEQ ID NO: 832) and ORF128-1 (SEQ ID NO: 830) show 99.5% identity in 622 aa 
overlap: 



orf 12 8a. pep MQAWYRPEIDGLRAVAVLSVMIFHLNNRWLPGGFLGVDIFFVISGFLITGI ILSEIQNG 

I I I M I I I I I I I I I I I I I I I I I II II I I I I I I I I I I I I I I I I I I I . I I I I : I I I I I I I I 

35 orf 128-1 MQ AVR YRP E I DGLRAVAVLS VM I FHLNNRWL PGG FLG VD I F FV I SGFL I TG I I LS E I QNG 

orf 12 8a. pep SFSFRDFYTRRIKRIYPAFIAAVSLASVIASQIFLYEDFNQMRKTVELSAVFLSNIYLGF 

MM MMIMIMM MMMMMMMMMMMIMM MM MIIMI 

orf 128-1 SFSFRDFYTRRIKRIYPAFIAAVSLASVIASQIFLYEDFNQMRKTVELSAVFLSNIYLGF 

orf 12 8a. pep QQGYFDLSADENPVLHIWSLAVEEQYYLLYPLLLIFCCKKTKSLRVLRNI SIILFLILTA 

40 I I I I I I I I I I I I I II I I I I I I I I I II I I I I I I I I I I I II II II I I I I I I I I I I I I I I I I I 

orf 128-1 QQGYFDLSADENP VLH I WSLAVEEQYYLLYPLLL I FCCKKTKSLRVLRN I S I I LFL I LTA 

orf 128a . pep TSFLPSGFYTDILNQPNTYYLSTLRFPELLAGSLLAVYGQTQNGRRQTANGKRQLLSSLC 

M II 1 1 M 1 1 1 1 1 1 1 1 1 1 I II 1 1 1 II M I II 1 1 1 1 M I M M 1 1 1 1 1 M 1 1 1 1 1 1 1 M 

orf 128-1 SSFLPSGFYTDILNQPNTYYLSTLRFPELLAGSLLAVYGQTQNGRRQTANGKRQLLSSLC 
45 orf 12 8a. pep FGALLACLFVIDKHNPF I PGMTLLLPCLLTALL I RSMQYGTLPTR I LSAS PIVFVGKISY 

i I II 1 1 1 1 M 1 1 M 1 1 1 1 M 1 1 1 1 1 1 1 1 1 1 1 M 1 1 1 : II 1 1 1 M M Ill 

orf 128-1 FGALLACLFVIDKHNPF I PGMTLLLPCLLTALL I RSMQYGTLPTR I LSAS PIVFVGKISY 

orf 128a . pep SLYLYHWIFIAFAHYITGDKQLGLPAVSAVAALTAGFSLLSYYLIEQPLRKRKMTFKKAF 
M I Mill I I I I I I I II I I II I I I I I I I I I I I I I I I II I M I I I I I I I I I I I I I I I I I 
orf 128-1 SLYLYHWIFIAFAHYITGDKQLGLPAVSAVAALTAGFSLLSYYLIEQPLRKRKMTFKKAF 

orf 128a .pep FCLYLAPSLILVGYNLYARGILKQEHLRPLPGAPLAAENHFPETVLTLGDSHAGHLRGFL 

1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 M 1 1 1 1 1 1 1 1 1 1 1 1 1 M I M 1 1 ! 1 1 1 M 

orf 12 8-1 FCLYLAPSLILVGYNLYARGILKQEHLRPLPGAPLAAENHFPETVLTLGDSHAGHLRGFL 
orf 12 8a . pep DYVGSREGWKAKILSLDSECLVWVDEKLADNPLCRKYRDEVEKAEAVFIAQFYDLRMGGQ 

5 II 1 1 1 1 1 . 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 M 1 1 1 1 1 1 1 1 1 1 1 1 M 1 1 1 : M 1 1 1 1 1 1 1 

orf 128 - 1 DYVGSREGWKAKILSLDSECLVWVDEKLADNPLCRKYRDEVEKAEAVFIAQFYDLRMGGQ ^ 

orf 12 8a. pep PVPRFEAQSFLIPGFPARFRETVKRIAAVKPVYVFANNTSISRSPLREEKLKRPAANQYL 

I I I I I I I II I I I I I ' I I I II II I I I I II I I I I I I II I I I I I I I I I I I I I I I I I I I I I 
or f 1 2 8 - 1 PVPRFEAQS FLI PGFPARFRETVKRI AAVKPVYVFANNTS I SRS PLREEKLKRFAANQYL 

10 orf 12 8a. pep RP I QAMGD I GKSNQAVFDL I KD I PNVHWVDAQKYLPKNTVE I YGRYLYGDQDHLTYFGS Y 

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I : I I I I I I I I I I I I I I I I : 
orfl28-l RPI QAMGD I GKSNQAVFDL I KD I PNVHWVDAQKYL P KNTVE I YGRYL YGDQDHLT YFGS Y 

orf 128a . pep YMGREFHKHERLLKSSRDGALQX 

I I I I I I I I I I I I I I h Mill 
15 orfl28-l YMGREFHKHERLLKSSHGGALQX 

Homology with a predicted ORF from N. gonorrhoeae 

ORF128 (SEQ ID NO: 828) shows 93.4% identity over 244 aa overlap with a predicted ORF 
(ORF128ng) (SEQ ID NO: 834) from N. gonorrhoeae: 

or f 1 2 8 . pep VSLASVI ASQ I FLYEDFNQMRKTVELSAVF 3 0 

20 | | | | | | | | | | | | | | | | | | | | | | | : I I I : I I 

orf 128ng ILSEIQNGSFSFRDFYTRRIKRIYPAFIAAVSLASVIASQIFLYEDFNQMRKTIELSTVF . 112 

orf 128 .pep LSN I YLGFQQGYFDLS ADENP VLH I WS LAVEEQYYLLYPLLL I FCCKKTKSLRVLRN I S I 90 

MINIM: 1 1 1 1 1 ; 1 1 1 1 1 1 1 1 1 1 1 1 II 1 1 1 1 1 1 1 1 1 1 1 1 1 1 MMMMMMM 

orf 12 8ng LSNIYLGFRLGYFDLSADENPVLHIWSLAVEEQYYLLYPLLLIFCYKKTKSLRVLRNISI 172 

25 orf 128 .pep ILFLILTASSFLPSGFYTDILNQPNTYYLSTLRFPELLAGSLLAVYGQTQNGRRQTANGK 150 

I M I I I I I I I hi I I I I I I I I I M I I I I I I I I I hi I I I I I I I I I I I I I I I Ml 
orf 12 8ng ILFLILTASSFLPAGFYTDILNQPNTYYLSTLRFPELLVGSLLAVYGQTQNGRRQTENGK 232 

orf 128 .pep RQLLSS LCFGALLACLFV I DKHNP F I PGMTLLLPCLLTALL I RSMQYGTLPTRI LS AS P I 210 

Mill MIIMhIllllllhllllhllllllllMIIIIIIIMIIIIIIIIMII 

30 orf 128ng RQLLSLLCFGALLVCLFVIDKHDPFIPGITLLLPCLLTALLIRSMQYGTLPTRILSASPI 292 

orf 12 8 .pep VFVGKI S YSLYLYHWI FI AFAPLI RGGKQLGLPA 244 

1 1 1 1 II I h II 1 1 1 1 1 1 1 1 1 I I I M 1 1 1 1 

orf 128ng VFVGKISYSLYLYHWIFIAFAHYITGDKQLGLPAVSAVAALTAGFSLLSYYLIEQPLRKR 352 

The complete length ORF128ng nucleotide sequence [<SEQ ID 833>] (SEQ ID NO: 833) is: 



1 ATGCAAGCTG TCCGATACAG GCCTGAAATT GACGGATTGC GGGCCGTCGC 

51 CGTGCTATCC GTCATTATTT TCCACCTGAA TAACCGCTGG CTGCCCGGAG 

101 GATTCCTGGG GGTGGACATT TTCTTTGTCA TCTCGGGATT CCTCATTACC 

151 AACATCATTC TTTCTGAAAT ACAGAACGGT TCTTTTTCTT TCCGGGATTT 

201 TTATACCCGC AGGATTAAGC GGATTTATCC TGCTTTTATT GCGGCCGTGT 

251 CCCTGGCTTC GGTGATTGCT TCTCAAATCT TCCTTTACGA AGATTTCAAC 

301 CAAATGAGGA AAACCATAGA GCTTTCTACG GTTTTTTTGT CCAATATTTA 

351 TTTGGGGTTC CGATTGGGGT ATTTCGATTT GAGTGCCGAC GAGAACCCCG 
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401 TACTGCATAT CTGGTCTTTG GCGGTAGAGG AACAGTATTA CCTCCTGTAT 

451 CCTCTTTTGC TGATATTCTG TTACAAAAAA ACCAAATCAC TACGGGTGCT 

501 GCGTAATATC AGCATCATCC TGTTTCTGAT TTTGACCGCA TCATCGTTTT 

551 TGCCGGCCGG GTTTTATACC GACATCCTCA ACCAACCcaa TACTTATTAC 

601 CTTTCGACAC TGAGGTTTCC CGAGCTGTTG GTGGGTTCGC TGTTGGCGGT 

651 TTACGGGCAA ACGCAAAACG GCAGACGGCA AACAGAAAAT GGAAAACGGC 

701 AGTTGCTTTC ATTACTCTGT TTCGGCGCat tgCTTGTCTG CCTGTTCGTG 

751 ATCGACAAAC ACGATCCGTT TATCCCGGGA ATAACCCTGC TCCTTCCCTG 

801 CCTGCTGACG GCGCTGCTTA TCCGGAGTAT GCAATACGGG ACACTTCCGA 

851 CCCGCATCCT GTCGGCAAGC CCCATCGTAT TTGTCGGCAA AATCTCTTAT 

901 TCCCTATACC TGTACCATTG GATTTTTATT GCCTTCGCCC ATTACATTAC 

951 AGGCGACAAA CAGCTCGGAC TGCCTGCCGT ATCGGCGGTT GCCGCGTTGA 

1001 CGGCCGGATT TTCCCTGTTG AGCTATTATT TGATTGAACA GCCGCTTAGA 

1051 AAACGGAAGA TGACCTTCAA AAAGGCATTT TTCTGCCTTT ATCTCGCCCC 

1101 GTCCCTGATG CTTGTCGGTT ACAACCTGTA TTCAAGAGGG ATATTGAAAC 

1151 AGGAACACCT CCGCCCGCTG CCCGGCACGC CCGTTGCTGC GGAAAATAAT 

1201 TTTCCGGAAA CCGTCTTGAC CCTCGGCGAC TCGCACGCCG GACACCTGCG 

1251 GGGGTTTCTG GATTATGTCG GCGGCAGGGA AGGGTGGAAA GCTAAAATCC 

1301 TGTCCCTCGA TTCGGAGTGT TTGGTTTGGG TGGATGAGAA GCTGGCAGAC 

1351 AACCCGTTGT GCCGAAAATA CCGGGATGAA GTTGAAAAAG CCGAAGCTGT

1401 TTTCATTGCC CAATTCTATG ATTTGAGGAT GGGCGGCCAG CCCGTGCCGA

1451 GATTTGAAGC GCAATCCTTC CTGATACCCG GGTTCAAAGC CCGATTCAGG

1501 GAAACCGTCA AGAGGATAGC CGCCGTCAAA CCTGTATATG TTTTTGCAAA 

1551 CAATACATCA ATCAGCCGTT CTCCCTTGAG GGAGGAAAAA TTGAAAAGAT 

1601 TTGCTATAAA CCAATACCTC CGGCCTATTC GGGCTATGGG CGACATCGGC

1651 AAGAGCAATC AGGCGGTCTT TGATTTGGTT AAAGATATTC CCAATGTGCA 

1701 TTGGGTGGAC GCACAAAAAT ACCTGCCCAA AAACACGGTC GAAATACACG 

1751 GACGCTATCT TTACGGCGAC CAAGACCACC TGACCTATTT CGGTTCTTAT 

1801 TATATGGGGC GGGAATTTCA CAAACACGAA CGCCTGCTCA AGCATTCCCG 

1851 AGGCGGCGCA TTGCAGTAG

This encodes a protein having amino acid sequence [<SEQ ID 834>] (SEQ ID NO: 834):

1 MQAVRYRPEI DGLRAVAVLS VIIFHLNNRW LPGGFLGVDI FFVISGFLIT

51 NIILSEIQNG SFSFRDFYTR RIKRIYPAFI AAVSLASVIA SQIFLYEDFN

101 QMRKTIELST VFLSNIYLGF RLGYFDLSAD ENPVLHIWSL AVEEQYYLLY

151 PLLLIFCYKK TKSLRVLRNI SIILFLILTA SSFLPAGFYT DILNQPNTYY

201 LSTLRFPELL VGSLLAVYGQ TQNGRRQTEN GKRQLLSLLC FGALLVCLFV

251 IDKHDPFIPG ITLLLPCLLT ALLIRSMQYG TLPTRILSAS PIVFVGKISY

301 SLYLYHWIFI AFAHYITGDK QLGLPAVSAV AALTAGFSLL SYYLIEQPLR

351 KRKMTFKKAF FCLYLAPSLM LVGYNLYSRG ILKQEHLRPL PGTPVAAENN

401 FPETVLTLGD SHAGHLRGFL DYVGGREGWK AKILSLDSEC LVWVDEKLAD

451 NPLCRKYRDE VEKAEAVFIA QFYDLRMGGQ PVPRFEAQSF LIPGFKARFR

501 ETVKRIAAVK PVYVFANNTS ISRSPLREEK LKRFAINQYL RPIRAMGDIG

551 KSNQAVFDLV KDIPNVHWVD AQKYLPKNTV EIHGRYLYGD QDHLTYFGSY

601 YMGREFHKHE RLLKHSRGGA LQ*

ORF128ng (SEQ ID NO: 834) and ORF128-1 (SEQ ID NO: 830) show 95.7% identity in 622 aa
overlap:

orf 12 8-1. pep MQAVRYRPEIDGLRAVAVLSVMIFHLNNRWLPGGFLGVDIFFVISGFLITGIILSEIQNG 

50 | | | | | | | | | | | | | | | | | | | M = I I I I I I I II I I I I I I I I I I I I I I I I I I I = I I I I I I I I I 

orf 128ng MQAVRYRPEIDGLRAVAVLSVIIFHLNNRWLPGGFLGVDIFFVISGFLITNIILSEIQNG 

orf 128-1 .pep SFSFRDFYTRRIKRIYPAFIAAVSLASVIASQIFLYEDFNQMRKTVELSAVFLSNIYLGF 

I I I I I I I II I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I h I I I = I I I I I I I I I I 
orf 128ng SFSFRDFYTRRIKRIYPAFIAAVSLASVIASQIFLYEDFNQMRKTIELSTVFLSNIYLGF 



orf 128-1 .pep QQGYFDLSADENPVLHIWSLAVEEQYYLLYPLLLI FCCKKTKSLRVLRNIS I ILFLILTA 

: 1 1 1 1 M 1 1 1 1 1 1 1 1 1 1 1 M 1 1 1 1 1 1 I I I I : I I I IIIIIMMIMIIIIIIIIII 

orf 128ng RLGYFDLSADENPVLHIWSIAVEEQYYLLYPLLLIFCYKKTKSLRVLRNIS I ILFLILTA 

orf 128-1 .pep SSFLPSGFYTDILNQPNTYYLSTLRFPELLAGSLLAVYGQTQNGRRQTANGKRQLLSSLC 

MMIIIIIIIIII IIIIIIIIIMM IMIIIIIMIIMi 1 1 1 1 1 1 1 1 II 

orf 128ng SSFLPAGFYTDILNQPNTYYLSTLRFPELLVGSLLAVYGQTQNGRRQTENGKRQLLSLLC 

orf 128-1 .pep FGALLACLFVIDKHNPFIPGMTLLLPCLLTALLIRSMQYGTLPTRILSASPIVFVGKISY 

I I I I I : I I I I I I I I : I II I I : I I I I I I I I I I I I I I I I I I M I I I I I I I I I I I I I I I I I I I 
orf 128ng FGALLVCLFVIDKHDPFIPGITLLLPCLLTALLIRSMQYGTLPTRILSASPIVFVGKISY 

orf 12 8 - 1 . pep SLYLYHWI FIAFAHYITGDKQLGLPAVSAVAALTAGFSLLSYYLIEQPLRKRKMTFKKAF 

II i I i 1 1 1 1 1 1 1 [ 1 1 1 1 1 1 1 i ' 1 1 1 1 1 [ 1 1 ! 1 1 1 1 1 1 1 1 1 1 

orf 12 8ng SLYLYHWI FIAFAHY I TGDKQLGLPAVSAVAALTAGFSLLSYYL I EQPLRKRKMTFKKAF 

orf 128-1 .pep FCLYLAPSLILVGYNLYARGILKQEHLRPLPGAPLAAENHFPETVLTLGDSHAGHLRGFL 
I ! i I I I I I :| M I I II: I I I I I I I I I I M I :h I I I h I I I I I I I M M I I M I I I I i 
orf 128ng FCLYLAPSLMLVGYNLYSRGILKQEHLRPLPGTPVAAENNFPETVLTLGDSHAGHLRGFL 

orf 128-1 .pep DYVGSREGWKAKILSLDSECLVWVDEKLADNPLCRKYRDEVEKAEAVFIAQFYDLRMGGQ 

I I I I : I I I f I I I I I I I I I I I I II I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I ! I I 
or f 12 8ng DYVGGREGWKAKILSLDSECLVWVDEKLADNPLCRKYRDEVEPCAEAVFIAQFYDLRMGGQ 

orf 128-1 .pep PVPRFEAQSFLIPGFPARFRETVKRIAAVKPVYVFANNTSISRSPLREEKLKRFAANQYL 

IIIIIIIIIMIIM MINIM Mill II MINIM III MINI Ml II II MM 

orf 12 8ng PVPRFEAQSFLIPGFKARFRETVKRIAAVKPVYVFANNTSISRSPLREEKLKRFAINQYL 

orf 128-1 .pep RPIQAMGDIGKSNQAVFDLIKDIPNVHWVDAQKYLPKNTVEIYGRYLYGDQDHLTYFGSY 

I I I : I I I I I I I I I I I I I I h I I I I I I I I M I I I I I I I I I I I I : I I I I I I I I I I I I I I I I I 
orf 128ng RPIRAMGDIGKSNQAVFDLVKDIPNVHWVDAQKYLPKNTVEIHGRYLYGDQDHLTYFGSY 

or f 12 8 - 1 . pep YMGREFHKHERLLKSSHGGALQX 

I MINIUM hllllll 
or f 12 8ng YMGREFHKHERLLKHSRGGALQX 

610 620 



In addition, [ORF218ng] ORF128ng (SEQ ID NO: 834) shows homology to a hypothetical
H. influenzae protein (SEQ ID NO: 1164):



sp|P43993|Y392_HAEIN HYPOTHETICAL PROTEIN HI0392 >gi|1074385|pir||B64007
hypothetical protein HI0392 - Haemophilus influenzae (strain Rd KW20)
>gi|1573364 (U32723) H. influenzae predicted coding region HI0392 [Haemophilus
influenzae] Length = 245
Score = 239 bits (604), Expect = 3e-62

Identities = 124/225 (55%), Positives = 152/225 (67%), Gaps = 1/225 (0%)

Query: 38  VDIFFVISGFLITNIILSEIQNGSFSFRDFYTRRIKRIYPXXXXXXXXXXXXXXXXFLYE 97
           +DIFFVISGFLIT II++EIQ  SFS + FYTRRIKRIYP                F+Y
Sbjct: 1   MDIFFVISGFLITGIIITEIQQNSFSLKQFYTRRIKRIYPAFITVMALVSFIASAIFIYN 60

Query: 98
           DFN++RKTIEL+ FLSN YLG GYFDLS A+ ENPVLH I WS LAVE Q
Sbjct: 61

Query: 158
           YKK + ++VL  I++ILF IL A+SF+ A FY ++L+QPN YYLS LRFPELLVGSLLA+
Sbjct: 121 YKKFREVKVLFIITLILFFILLATSFVSANFYKEVLHQPNIYYLSNLRFPELLVGSLLAI 180

Query: 218 YGQTQNGRRQTENGKRQLLSLLCFGALLVCLFVIDKHDPFIPGIT 262
           Y    N + Q       +L++L    L  CLF+++ +  FIPGIT
Sbjct: 181 YHNLSN-KVQLSKQVNNILAILSTLLLFSCLFLMNNNIAFIPGIT 224

This analysis, including the identification of several putative transmembrane domains, suggests 
that these proteins from N. meningitidis and N. gonorrhoeae, and their epitopes, could be useful 
antigens for vaccines or diagnostics, or for raising antibodies. 

Example 99 

The following partial DNA sequence was identified in N. meningitidis [<SEQ ID 835>] (SEQ ID
NO: 835):

1 . . ATTATTTACG AATACCGCTG GATGTTTCTT TACGGCGCAC TGACGACCTT 

51 GGGGCTGACG GTCGTGGCAA C . GCGGGCGG TTCGGTATTG GGTCTGTTGT 

101 TGGCGTTGGC GCGCCTGATT CACTTGGAAA AAGCCGGTGC GCCGATGCGC 

151 GTGCTGGCGT GGGCGTTGCG TAAAGTTTCG CTGCTGTATG TTACGCTGTT 

201 CCGGGGTACG CCGCTGTTTG TGCAGATTGT GATTTGGGCG TATGTGTGGT 

251 TTCCGTTTTT CGTC . . 

This corresponds to the amino acid sequence [<SEQ ID 836; ORF129>] (SEQ ID NO: 836;
ORF129):



1 . . IIYEYRWMFL YGALTTLGLT VVAXAGGSVL GLLLALARLI HLEKAGAPMR
51 VLAWALRKVS LLYVTLFRGT PLFVQIVIWA YVWFPFFV. . 



Further work revealed the complete nucleotide sequence [<SEQ ID 837>] (SEQ ID NO: 837):



1 ATGGATTTTC GTTTTGACAT TATTTACGAA TACCGCTGGA TGTTTCTTTA 

51 CGGCGCACTG ACGACCTTGG GGCTGACGGT CGTGGCAACG GCGGGCGGTT 

101 CGGTATTGGG TCTGTTGTTG GCGTTGGCGC GCCTGATTCA CTTGGAAAAA 

151 GCCGGTGCGC CGATGCGCGT GCTGGCGTGG GCGTTGCGTA AAGTTTCGCT 

201 GCTGTATGTT ACGCTGTTCC GGGGTACGCC GCTGTTTGTG CAGATTGTGA 

251 TTTGGGCGTA TGTGTGGTTT CCGTTTTTCG TCCATCCTTC AGACGGCATT 

301 TTGGTCAGCG GCGAGGCGGC AATCGCGCTG CGTCGCGGAT ACGGGCCGCT
351 GATTGCCGGT TCTTTGGCAC TGATCGCCAA CTCGGGGGCG TATATCTGTG

401 AGATTTTCCG CGCGGGCATC CAGTCTATAG ACAAAGGACA GATGGAGGCG
451 GCGCGTTCTT TGGGGCTGAC CTATCCGCAG GCGATGCGCT ATGTGATTCT
501 GCCGCAGGCA TTGCGCCGCA TGCTGCCGCC TTTGGCGAGC GAGTTCATCA 
551 CGCTCTTGAA AGACAGCTCG CTGCTGTCGG TCATTGCTGT GGCGGAGTTG 
601 GCGTATGTTC AGAATACGAT TACGGGCCGG TATTCGGTTT ATGAAGAACC 
651 GCTTTACACC GTCGCCCTGA TTTATCTGTT GATGACGACT TTCTTAGGCT 
701 GGATATTCCT GCGTTTGGAA AAACGTTACA ATCCGCAACA CCGCTGA 



This corresponds to the amino acid sequence [<SEQ ID 838; ORF129-1>] (SEQ ID NO: 838;
ORF129-1):



1 MDFRFDIIYE YRWMFLYGAL TTLGLTVVAT AGGSVLGLLL ALARLIHLEK

51 AGAPMRVLAW ALRKVSLLYV TLFRGTPLFV QIVIWAYVWF PFFVHPSDGI

101 LVSGEAAIAL RRGYGPLIAG SLALIANSGA YICEIFRAGI QSIDKGQMEA

151 ARSLGLTYPQ AMRYVILPQA LRRMLPPLAS EFITLLKDSS LLSVIAVAEL

201 AYVQNTITGR YSVYEEPLYT VALIYLLMTT FLGWIFLRLE KRYNPQHR*

Computer analysis of this amino acid sequence gave the following results: 



Homology with a predicted ORF from N. meningitidis (strain A)



ORF129 (SEQ ID NO: 836) shows 98.9% identity over an 88 aa overlap with an ORF (ORF129a)
(SEQ ID NO: 840) from strain A of N. meningitidis:



                          10        20        30        40        50
orf129.pep        IIYEYRWMFLYGALTTLGLTVVAXAGGSVLGLLLALARLIHLEKAGAPMRVLAW
                  ||||||||||||||||||||||| ||||||||||||||||||||||||||||||
orf129a     MDFRFDIIYEYRWMFLYGALTTLGLTVVATAGGSVLGLLLALARLIHLEKAGAPMRVLAW
                    10        20        30        40        50        60

                60        70        80
orf129.pep  ALRKVSLLYVTLFRGTPLFVQIVIWAYVWFPFFV
            ||||||||||||||||||||||||||||||||||
orf129a     ALRKVSLLYVTLFRGTPLFVQIVIWAYVWFPFFVHPSDGILVSGEAAIALRRGYGPLIAG
                    70        80        90       100       110       120

orf129a     SLALIANSGAYICEIFRAGIQSIDKGQMEAARSLGLTYPQAMRYVILPQALRRMLPPLAS
                   130       140       150       160       170       180



The complete length ORF129a nucleotide sequence [<SEQ ID 839>] (SEQ ID NO: 839) is:



1 ATGGATTTTC GTTTTGACAT TATTTACGAA TACCGCTGGA TGTTTCTTTA 

51 CGGCGCACTG ACGACCTTGG GGCTGACGGT CGTGGCGACG GCGGGCGGTT 

101 CGGTATTGGG TCTGTTGTTG GCGTTGGCGC GCCTGATTCA CTTGGAAAAA 

151 GCCGGTGCGC CGATGCGCGT GCTGGCGTGG GCGTTGCGTA AGGTTTCGCT 

201 GCTGTATGTT ACGCTGTTCC GGGGTACGCC GCTGTTTGTG CAGATTGTGA 

251 TTTGGGCGTA TGTGTGGTTT CCGTTTTTCG TCCATCCTTC AGACGGCATT 

301 TTGGTTAGCG GCGAGGCGGC AATCGCGCTG CGTCGCGGAT ACGGGCCGCT
351 GATTGCCGGT TCTTTGGCAC TGATCGCCAA CTCGGGGGCG TATATCTGTG

401 AGATTTTCCG CGCGGGCATC CAGTCTATAG ACAAAGGACA GATGGAGGCG
451 GCGCGTTCTT TGGGGCTGAC CTATCCGCAG GCGATGCGCT ATGTGATTCT
501 GCCGCAGGCA TTGCGCCGTA TGCTGCCGCC TTTGGCGAGC GAGTTCATCA 
551 CGCTCTTGAA AGACAGCTCG CTGCTGTCGG TCATTGCTGT GGCGGAGTTG 
601 GCGTATGTTC AGAATACGAT TACGGGCCGG TATTCGGTTT ATGAAGAACC 
651 GCTTTACACC GTCGCCCTGA TTTATCTGTT GATGACGACT TTCTTAGGCT 
701 GGATATTCCT GCGTTTGGAA AAACGTTACA ATCCGCAACA CCGCTGA 

This encodes a protein having amino acid sequence [<SEQ ID 840>] (SEQ ID NO: 840):



1 MDFRFDIIYE YRWMFLYGAL TTLGLTVVAT AGGSVLGLLL ALARLIHLEK

51 AGAPMRVLAW ALRKVSLLYV TLFRGTPLFV QIVIWAYVWF PFFVHPSDGI

101 LVSGEAAIAL RRGYGPLIAG SLALIANSGA YICEIFRAGI QSIDKGQMEA

151 ARSLGLTYPQ AMRYVILPQA LRRMLPPLAS EFITLLKDSS LLSVIAVAEL

201 AYVQNTITGR YSVYEEPLYT VALIYLLMTT FLGWIFLRLE KRYNPQHR*

ORF129a (SEQ ID NO: 840) and ORF129-1 (SEQ ID NO: 838) show 100.0% identity in 248 aa
overlap:

orf129a.pep  MDFRFDIIYEYRWMFLYGALTTLGLTVVATAGGSVLGLLLALARLIHLEKAGAPMRVLAW
             ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
orf129-1     MDFRFDIIYEYRWMFLYGALTTLGLTVVATAGGSVLGLLLALARLIHLEKAGAPMRVLAW

orf129a.pep  ALRKVSLLYVTLFRGTPLFVQIVIWAYVWFPFFVHPSDGILVSGEAAIALRRGYGPLIAG
             ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
orf129-1     ALRKVSLLYVTLFRGTPLFVQIVIWAYVWFPFFVHPSDGILVSGEAAIALRRGYGPLIAG

orf129a.pep  SLALIANSGAYICEIFRAGIQSIDKGQMEAARSLGLTYPQAMRYVILPQALRRMLPPLAS
             ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
orf129-1     SLALIANSGAYICEIFRAGIQSIDKGQMEAARSLGLTYPQAMRYVILPQALRRMLPPLAS

orf129a.pep  EFITLLKDSSLLSVIAVAELAYVQNTITGRYSVYEEPLYTVALIYLLMTTFLGWIFLRLE
             ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
orf129-1     EFITLLKDSSLLSVIAVAELAYVQNTITGRYSVYEEPLYTVALIYLLMTTFLGWIFLRLE

orf129a.pep  KRYNPQHRX
             |||||||||
orf129-1     KRYNPQHRX

Homology with a predicted ORF from N. gonorrhoeae 

ORF129 (SEQ ID NO: 836) shows 98.9% identity over an 88 aa overlap with a predicted ORF
(ORF129ng) (SEQ ID NO: 842) from N. gonorrhoeae:

orf 12 9 .pep IIYEYRWMFLYGALTTLGLTWAXAGGSVLGLLLALARLIHLEKAGAPMRVLAW 54 

I M 1 1 1 M 1 1 1 1 i 1 1 i 1 1 1 1 M M I i II M MM 1 1 1 1 1 1 1 M M 1 1 M M 

or f 1 2 9ng MDFRFD 1 1 YE YRWMFLYGALTTLGLTWATAGGS VLGLLLALARL IHLEKAGAPMRVLAW 6 0 

orf 129 .pep ALRKVS LL YVTL FRGT P L FVQ I V I W A YVWF P FFV 88 

25 | | | | | | | | | | | | M I I I I I I I I I I I I I I I I I M I 

orf 12 9ng ALRKVSLLYVTLFRGTPLFVQIVIWAYVWFPFFVILHTAFLGNAMRQSRRVPDKGRWIAG 120 

An ORF129ng nucleotide sequence [<SEQ ID 841>] (SEQ ID NO: 841) was predicted to encode a
protein having amino acid sequence [<SEQ ID 842>] (SEQ ID NO: 842):

1 MDFRFDIIYE YRWMFLYGAL TTLGLTVVAT AGGSVLGLLL ALARLIHLEK

51 AGAPMRVLAW ALRKVSLLYV TLFRGTPLFV QIVIWAYVWF PFFVILHTAF

101 LGNAMRQSRR VPDKGRWIAG SLELNCQPRG RKTRGEFPPG ESNLGTEPRN

151 PLSMGQRRFP GCENWYPPQN FIKK*

Further work revealed the following gonococcal sequence [<SEQ ID 843>] (SEQ ID NO: 843):

1 ATGGATTTTc gtTTTGACAT TATTTAcgaA TACCGCTGGA TGTTTCTTTA 

51 CGGCGCACTG Acgaccttgg ggctgacggt cgtggcgacg gCGGGCGGTT 

101 CGGtattggG TCTGTTGTTG GCGTTGGCGC GCCTGATTCA CTTGGAAAAA 

151 GCCGGTGCGC CGATGCGCGT GCTGGCGTGG GCGTTGCGTA AGGTTTCGCT 

201 GCTGTACGTT ACCCTGTTCC GGGGTACGCC GCTGTTTGTG CAGATTGTGA

251 TTTGGGCGTA TGTGTGGTTT CCGTTTTTCG TCCATCCTTC AGACGGCATT

301 TTGGTCAGCG GCGAGGCGGC AATCGCGCTG CGTCGCGGAT ACGGGCCGCT 

351 GATTGCCGGT TCTTTGGCAC TGATCGCCAA CTCGGGGGCG TATATCTGTG 

401 AGATTTTCCG CGCGGGCATC CAGTCTATAG ACAAAGGACA GATGGAGGCG

451 GCGTGTTCTT TGGGACTGAC CTATCCGCAG GCGATGCGCT ATGTGATTCT 

501 GCCGCAGGCA TTGCGCCGTA TGCTGCCGCC TTTGGCGAGC GAGTTCATCA 

551 CGCTCTTGAA AGACAGCTCG CTGCTGTCGG TCATTGCTGT GGCGGAGTTG 

601 GCGTATGTTC AGAATACGAT TACGGGCCGG TATTCGGTTT ATGAAGAACC 

651 GCTTTACACC GCCGCCCTGA TTTATCTGTT GATGACGACT TTCTTAGGCT 

701 GGATATTCCT GCGTTTGGAA AAACGTTACA ATCCGCAACA CCGCTGA 

This corresponds to the amino acid sequence [<SEQ ID 844; ORF129ng-1>] (SEQ ID NO: 844;
ORF129ng-1):



1 MDFRFDIIYE YRWMFLYGAL TTLGLTVVAT AGGSVLGLLL ALARLIHLEK

51 AGAPMRVLAW ALRKVSLLYV TLFRGTPLFV QIVIWAYVWF PFFVHPSDGI

101 LVSGEAAIAL RRGYGPLIAG SLALIANSGA YICEIFRAGI QSIDKGQMEA

151 ARSLGLTYPQ AMRYVILPQA LRRMLPPLAS EFITLLKDSS LLSVIAVAEL

201 AYVQNTITGR YSVYEEPLYT VALIYLLMTT FLGWIFLRLE KRYNPQHR*

ORF129ng-1 (SEQ ID NO: 844) and ORF129-1 (SEQ ID NO: 838) show 99.2% identity in 248 aa
overlap:



orf 12 9 - 1 . pep MDFRFDI I YEYRWMFLYGALTTLGLTWATAGGSVLGLLLAIiARLIHLEKAGAPMRVLAW 

I 1 1 1 1 1 1 1 1 1 U M M M II 1 1 , 1 1 M 1 1 1 1 1 1 1 1 H II I 1 1 M I 1 1 1 1 ! 1 1 1 1 I 

orf 12 9ng-l MDFRFDIIYEYRWMFLYGALTTLGLTWATAGGSVLGLLLALARLIHLEKAGAPMRVLAW 
orf 129-1 .pep ALRKVSLLYVTLFRGTPLFVQ I VI WAYVWFPFFVHPSDGI LVSGEAAI ALRRGYGPL I AG 

IIIIII.IIIMIIIIIIIIIIIII IIIIMIIIIIII MINI lllllilllllll 

orf 129ng-l ALRKVSLLYVTLFRGTPLFVQ I VI WAYVWFPFFVHPSDGI LVSGEAAI ALRRGYGPL I AG 
orf 129-1 .pep SLALIANSGAYICEIFRAGIQSIDKGQMEAARSLGLTYPQAMRYVILPQALRRMLPPLAS 

I I I I I I 1 1 I M I I I I I I : 1 1 I I 1 1 1 1 1 1 I 1 1 1 1 I I I 1 1 1 1 1 1 ! I I I I II I I I 1 1 I 

orf 129ng-l SLALIANSGAYICEIFRAGIQSIDKGQMEAACSLGLTYPQAMRYVILPQALRRMLPPLAS 

orf 12 9-1 .pep EFITLLKDSSLLSVIAVAELAYVQNTITGRYSVYEEPLYTVALIYLLMTTFLGWIFLRLE 

llllllllllllllllll IIMIIMIIIII llllll :|IIM lllllilllllll 
orf 12 9ng-l EFITLLKDSSLLSVIAVAELAYVQNTITGRYSVYEEPLYTAALIYLLMTTFLGWIFLRLE 

orf 12 9-1. pep KRYNPQHRX 

lllllllll 
orfl29ng~l KRYNPQHRX 

In addition, ORF129ng-1 (SEQ ID NO: 844) is homologous to an ABC transporter (SEQ ID NO:
1165) from A. fulgidus:



2650409 (AE001090) glutamine ABC transporter, permease protein (glnP) [Archaeoglobus 
fulgidus] Length = 224 
Score = 132 bits (329) , Expect = 2e-30 

Identities = 86/178 (48%), Positives = 103/178 (57%), Gaps = 18/178 (10%) 



Query: 65 VSLLYVTLFRGTPLFVQIVIWAYVWFPFFVHPSDGILVSGEAAIALRRGYGPLIAGSLAL 124

+S YV + RGTPL VQI+I +F P+ GI + E A G +AL

Sbjct: 58 ISTAYVEVIRGTPLLVQILIVYFGLPAIGINLQPEPA GIIAL 99

Query: 125 IANSGAYICEIFRAGIQSIDKGQMEAACSLGLTYPQAMRYVILPQALRRMLPPLASEFIT 184

SGAYI EI RAGI+SI GQMEAA SLG+TY QAMRYVI PQA R +LP L +EFI 
Sbjct: 100 SICSGAYIAEIVRAGIESIPIGQMEAARSLGMTYLQAMRYVIFPQAFRNILPALGNEFIA 159 

Query: 185 LLKDSSLLSVIAVAELAYVQNTITGRYSVYEEPLYTAALIYLLMTTFLGWIFLRLEKR 242 

LLKDSSLLSVI++ EL V I P AL YL+MT L + +K+ 

Sbjct: 160 LLKDSSLLSVISIVELTRVGRQIVNTTFNAWTPFLGVALFYLMMTIPLSRLVAYSQKK 217 

This analysis, including the identification of transmembrane domains in the two proteins, suggests 
that the proteins from N. meningitidis and N. gonorrhoeae, and their epitopes, could be useful 
antigens for vaccines or diagnostics, or for raising antibodies. 
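
The percentages in the BLAST summary line for the A. fulgidus hit are whole-number ratios of the reported counts to the alignment length. The following is a minimal Python sketch, assuming the percentage is truncated rather than rounded (an assumption that is consistent with the figures quoted above); the helper name `blast_percent` is hypothetical and not part of the original disclosure.

```python
def blast_percent(count: int, length: int) -> int:
    """Return count/length as a whole-number percentage, truncated (assumed)."""
    return (100 * count) // length


# Counts quoted in the summary line for the glnP permease alignment.
alignment_length = 178
for label, count in [("Identities", 86), ("Positives", 103), ("Gaps", 18)]:
    pct = blast_percent(count, alignment_length)
    print(f"{label} = {count}/{alignment_length} ({pct}%)")
```

Under the same assumption, the counts quoted for the H. influenzae HI0392 hit (124/225, 152/225 and 1/225) give 55%, 67% and 0%.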



Example 100 



The following partial DNA sequence was identified in N. meningitidis [<SEQ ID 845>] (SEQ ID
NO: 845):



1 . . CTGAAAGAAT GCCGTCTGAA AGACCCTGTT TTTATTCCAA ATATCGTTTA 

51 TAAGAACATC GCCATTACTT TCCTGCTCTT GCACGCCGCC GCCGAACTTT 

101 GGCTGCCCGC GCAAACCGCC GGTTTTACCG CGCTCGCCGT CGGCTTCATC 

151 CTGCTCGCCA AGCTGCGTGA gCTTCACCAT CACGAACTCT TACGTAAACA 

201 CTACGTCCGC ACTTATTACy TGCTCCAACT CTTTGCCGCC GCAGgcTAgT
251 TTGTGGACAG GCGCGGCGwA ATTACAAAAC CTGCCCGCyT CCGCGCCCCT

301 GCACCTGATT ACCCTCGGCG GCATGATGGG CGGCGTGATG ATGGTGTGGc
351 TGACCGCCGG ACTGTGGCAC AGCGGCTTTA CCAAACTCGA CTACCCCAAA

401 CTCTGCCGCA TTGCCGTCCC CATCCTTTTC GCCGCCGCCG TCTCGCGCGC
451 TTTCTTGrTG AACGTGAACC CGrTATTTTT CATTACCGTT CCTGCGATTC
501 TGACCGCCGC CGTATTCGTA CTGTATCTTT TCrCGTTTAT ACCGATATTT 
551 CGGGCGAATG CGTTTACAGA CGATCCGGAr Tar 

This corresponds to the amino acid sequence [<SEQ ID 846; ORF130>] (SEQ ID NO: 846;
ORF130):



1 . . LKECRLKDPV FIPNIVYKNI AITFLLLHAA AELWLPAQTA GFTALAVGFI
51 LLAKLRELHH HELLRKHYVR TYYLLQLFAA AGSLWTGAAX LQNLPASAPL
101 HLITLGGMMG GVMMVWLTAG LWHSGFTKLD YPKLCRIAVP ILFAAAVSRA
151 FLXNVNPXFF ITVPAILTAA VFVLYLFXFI PIFRANAFTD DPE*
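
The partial nucleotide sequence above contains lower-case ambiguity symbols (r, y, w), and the ORF130 translation shows X at positions where the ambiguity changes the encoded residue. The following is a minimal Python sketch of that rule, assuming the standard IUPAC nucleotide codes and the standard genetic code. The function name `translate_ambiguous` and the use of X for unresolved codons are illustrative assumptions, and the excerpts are split at the codon boundaries implied by the translation shown.

```python
from itertools import product

# Standard genetic code, built from the conventional TCAG ordering.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    b1 + b2 + b3: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, b1 in enumerate(BASES)
    for j, b2 in enumerate(BASES)
    for k, b3 in enumerate(BASES)
}

# IUPAC nucleotide codes mapped to the bases they can stand for.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "W": "AT", "S": "CG", "K": "GT", "M": "AC",
    "B": "CGT", "D": "AGT", "H": "ACT", "V": "ACG", "N": "ACGT",
}


def translate_ambiguous(dna: str) -> str:
    """Translate in frame 1; emit 'X' when an ambiguous codon has several readings."""
    dna = dna.upper().replace(" ", "")
    usable = len(dna) - len(dna) % 3  # ignore a trailing partial codon
    protein = []
    for pos in range(0, usable, 3):
        options = (IUPAC[base] for base in dna[pos:pos + 3])
        residues = {CODON_TABLE["".join(codon)] for codon in product(*options)}
        protein.append(residues.pop() if len(residues) == 1 else "X")
    return "".join(protein)


# Short excerpts of SEQ ID NO: 845 that contain ambiguity symbols.
for excerpt in ["TACyTG", "GCGwAA", "TTGrTG", "GArTar"]:
    print(excerpt, "->", translate_ambiguous(excerpt))
```

In this sketch `yTG` resolves to L because both CTG and TTG encode leucine, whereas `wAA` and `rTG` each have two possible readings and are reported as X.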



Further work revealed the complete nucleotide sequence [<SEQ ID 847>] (SEQ ID NO: 847):



1 ATGCGGCCGT TTTTCGTCGG CGCGGCGGTG CTTGCCATAC TCGGTGCGCT

51 GGTGTTTTTC ATCAACCCCG GTGCCATCGT CCTGCACCGC CAAATTTTCT 

101 TGGAACTTAT GCTGCCGGCG GCATACGGCG GTTTTTTGAC TGCGGCTTTG 

151 TTGGACTGGA CGGGTTTTTC GGGTAACCTG AAACCTGTCG CGACTTTGAT 

201 GGCGGCATTA TTGCTCGCCG CATCCGCTAT ACTGCCCTTT TCGCCGCAAA

251 CTGCCTCGTT TTTCGTCGCC GCCTATTGGC TGGTGTTGCT GCTGTTCTGC

301 GCCCGGCTGA TTTGGCTAGA CCGAAACACC GACAACTTCG CCCTGCTAAT 

351 GTTACTTGCC GCGTTCACTG TTTTTCAGAC GGCATATGCC GTCAGCGGCG 

401 ATTTGAACCT GTTGCGCGCG CAAGTGCATC TAAATATGGC GGCGGTGATG

451 TTCGTATCCG TGCGCGTCAG TATTCTTTTG GGCGCGGAAG CCCTGAAAGA

501 ATGCCGTCTG AAAGACCCTG TTTTTATTCC AAATATCGTT TATAAAAACA 
551 TCGCCATTAC TTTCCTGCTC TTGCACGCCG CCGCCGAACT TTGGCTGCCC

601 GCGCAAACCG CCGGTTTTAC CGCGCTCGCC GTCGGCTTCA TCCTGCTCGC 

651 CAAGCTGCGT GAGCTTCACC ATCACGAACT CTTACGTAAA CACTACGTCC 

701 GCACTTATTA CCTGCTCCAA CTCTTTGCCG CCGCAGGCTA TTTGTGGACA 

751 GGCGCGGCGA AATTACAAAA CCTGCCCGCC TCCGCGCCCC TGCACCTGAT 

801 TACCCTCGGC GGCATGATGG GCGGCGTGAT GATGGTGTGG CTGACCGCCG 

851 GACTGTGGCA CAGCGGCTTT ACCAAACTCG ACTACCCCAA ACTCTGCCGC 

901 ATTGCCGTCC CCATCCTTTT CGCCGCCGCC GTCTCGCGCG CTTTCTTGAT 

951 GAACGTGAAC CCGATATTTT TCATTACCGT TCCTGCGATT CTGACCGCCG 

1001 CCGTATTCGT ACTGTATCTT TTCACGTTTA TACCGATATT TCGGGCGAAT 

1051 GCGTTTACAG ACGATCCGGA ATAA 

This corresponds to the amino acid sequence [<SEQ ID 848; ORF130-1>] (SEQ ID NO: 848;
ORF130-1):



1 MRPFFVGAAV LAILGALVFF INPGAIVLHR QIFLELMLPA AYGGFLTAAL

51 LDWTGFSGNL KPVATLMAAL LLAASAILPF SPQTASFFVA AYWLVLLLFC

101 ARLIWLDRNT DNFALLMLLA AFTVFQTAYA VSGDLNLLRA QVHLNMAAVM

151 FVSVRVSILL GAEALKECRL KDPVFIPNIV YKNIAITFLL LHAAAELWLP

201 AQTAGFTALA VGFILLAKLR ELHHHELLRK HYVRTYYLLQ LFAAAGYLWT

251 GAAKLQNLPA SAPLHLITLG GMMGGVMMVW LTAGLWHSGF TKLDYPKLCR

301 IAVPILFAAA VSRAFLMNVN PIFFITVPAI LTAAVFVLYL FTFIPIFRAN

351 AFTDDPE*

Computer analysis of this amino acid sequence gave the following results: 



Homology with a predicted ORF from N. meningitidis (strain A)



ORF130 (SEQ ID NO: 846) shows 94.3% identity over a 193 aa overlap with an ORF (ORF130a)
(SEQ ID NO: 850) from strain A of N. meningitidis:



10 20 30 

orf 130 .pep LKECRLKDPVFI PNI VYKNI AITFLLLHAA 

I I I I M I I I i I i I M II I I I I I I M I I I 
orf 130a LNLLRAQVHLNMAAVMFVSVRVS I LLGAEALKSCRLKDPVF I PNVVYKN I AITFLLLHAA 

140 150 160 170 180 190 

40 50 60 70 80 90 

orf 130. pep AELWLPAQTAGFTALAVGFILLAKLRELHHHELLRKHYVRTYYLLQLFAAAGSLWTGAAX 

1 1 1 1 1 1 1 1 1 1 1 1 hi M 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 M M 1 1 1 1 1 , M II M I MINI 

orf 130a AELWLPAQTAGFTSLAVGFILLAKLRELHHHELLRKHYVRTYYLLQLFAAAGYLWTGAAK 
200 210 220 230 240 250 

100 110 120 130 140 150 

orf 130. pep LQNLPASAPLHLITLGGMMGGVMMVWLTAGLWHSGFTKLDYPKLCRIAVPILFAAAVSRA 

I I I I I I I I I II I I I I I I I I >: I Ml I I MM I I I I I M I I I I I I I M I I I M I I I I I 
orf 130a LQNLPASAPLHLITLGGMMGSVMMVWLTAGLWHSGFTKLDYPKLCRIAVPILFAAAVSRA 
260 270 280 290 300 310 

160 170 180 190 

orf 130 . pep FLXNVNPXFFI TVPAI LTAAVFVLYLFXFI P I FRANAFTDDPEX 

I 1 1 1 1 I M I M 1 1 M 1 1 1 1 1 M M M M 1 1 1 1 1 1 li 

or f 1 3 0 a VLMNVNP I FF I TVPAI LTAAVFVLYLLTFVP I FRANAFTDDPEX 

320 330 340 350 

The complete length ORF130a nucleotide sequence [<SEQ ID 849>] (SEQ ID NO: 849) is:



1 ATGCGGCCGT TTTTCGTCGG CGCGGCGGTG CTTGCCATAC TCGGTGCGCT 

51 GGTGTTTTTC ATCAACCCCG GTGCCATCGT CCTGCACCGC CAAATTTTCT 

101 TGGAACTTAT GCTGCCGGCG GCATACGGCG GTTTTTTGAC TGCGGCTTTG 

151 TTGGACTGGA CGGGTTTTTC GGGTAACCTG AAACCTGTCG CGACTTTGAT 

201 GGCGGCATTA TTGCTCGCCG CATCCGCTAT ACTGCCCTTT TCGCCGCAAA 

251 CTGCCTCGTT TTTCGTCGCC GCCTATTGGC TGGTGTTGCT GCTGTTCTGC 

301 GCCCGGCTGA TTTGGCTAGA CCGAAACACC GACAACTTCG CCCTGCTAAT
351 GTTACTTGCC GCGTTCACTG TTTTTCAGAC GGCATATGCC GTCAGCGGCG 

401 ATTTGAACCT GTTGCGCGCG CAAGTGCATC TAAATATGGC GGCGGTGATG
451 TTCGTATCCG TGCGCGTCAG TATTCTTTTG GGCGCGGAAG CCCTGAAAGA
501 ATGCCGTCTG AAAGACCCAG TATTCATCCC CAATGTCGTC TATAAAAACA 
551 TCGCCATTAC CTTCCTGCTC CTGCACGCCG CCGCCGAACT TTGGCTGCCT 
601 GCGCAAACCG CCGGTTTTAC CTCGCTCGCC GTCGGCTTTA TCCTGCTTGC 
651 CAAGCTGCGT GAGCTTCACC ATCACGAACT CCTGCGCAAA CACTACGTCC
701 GCACTTATTA CCTGCTCCAA CTCTTTGCCG CCGCAGGCTA TTTGTGGACA 
751 GGCGCGGCGA AATTACAAAA CCTGCCCGCC TCCGCGCCCC TGCACCTGAT 
801 TACCCTCGGT GGCATGATGG GCAGCGTGAT GATGGTGTGG CTGACTGCCG 
851 GACTGTGGCA CAGCGGCTTT ACCAAGCTCG ACTACCCGAA ACTCTGCCGC 
901 ATCGCCGTCC CCATCCTNTT CGCCGCCGCC GTTTCGCGCG CTGTTTTAAT 
951 GAACGTAAAC CCGATATTCT TCATCACCGT CCCCGCAATT CTGACCGCCG 

1001 CCGTGTTCGT GCTTTACCTG CTGACATTCG TACCGATCTT TCGGGCGAAC 

1051 GCGTTTACAG ACGATCCGGA ATAA 

This encodes a protein having amino acid sequence [<SEQ ID 850>] (SEQ ID NO: 850):

1 MRPFFVGAAV LAILGALVFF INPGAIVLHR QIFLELMLPA AYGGFLTAAL

51 LDWTGFSGNL KPVATLMAAL LLAASAILPF SPQTASFFVA AYWLVLLLFC

101 ARLIWLDRNT DNFALLMLLA AFTVFQTAYA VSGDLNLLRA QVHLNMAAVM

151 FVSVRVSILL GAEALKECRL KDPVFIPNVV YKNIAITFLL LHAAAELWLP

201 AQTAGFTSLA VGFILLAKLR ELHHHELLRK HYVRTYYLLQ LFAAAGYLWT

251 GAAKLQNLPA SAPLHLITLG GMMGSVMMVW LTAGLWHSGF TKLDYPKLCR

301 IAVPILFAAA VSRAVLMNVN PIFFITVPAI LTAAVFVLYL LTFVPIFRAN

351 AFTDDPE*



ORF130a (SEQ ID NO: 850) and ORF130-1 (SEQ ID NO: 848) show 98.3% identity in 357 aa
overlap:



orf 130a .pep 



MRPFFVGAAVLAILGALVFFINPGAIVLHRQIFLELMLPAAYGGFLTAALLDWTGFSGNL 



orf 130-1 




orf 130a .pep 



KPVATLMAALLLAASAILPFSPQTASFFVAAYWLVLLLFCARLIWLDRNTDNFALLMLLA 



orf 130-1 




orf 130a .pep 



AFTVFQTAYAVSGDLNLLRAQVHLNMAAVMFVSVRVSILLGAEALKECRLKDPVFIPNW 



orf 130-1 




orf 130a. pep 
orf 130-1 



YKNIAITFLLLHAAAELWLPAQTAGFTSLAVGFILLAKLRELHHHELLRKHYVRTYYLLQ 

I I I I I II I I I I I M I I I I I I I I I M I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 
YKNIAITFLLLHAAAELWLPAQTAGFTALAVGFILLAKLRELHHHELLRKHYVRTYYLLQ 



CHIR-0160 (356.001) 






PATENT 



orf 13 0a . pep LFAAAGYLWTGAAKLQNLPASAPLHLITLGGMMGSVMMWLTAGLWHSGFTKXDYPKLCR 

III IIIIIIIIIMIIIIIIIIM INI hllllllllMIMIilMI Mill 

orf 13 0 - 1 LFAAAGYLWTGAAKLQNLPASAPLHLITLGGMMGGVMIWWLTAGLWHSGFTKLDYPKLCR 

orf 130a. pep I AVP ILFAAAVSRAVLMNVNP I FF ITVPAI LTAAVFVLYLLTFVPI FRANAFTDDPE 

I I I I I : I I I I I I I I I I I I I I I I I II I I I I I I I I I I I I h I I : I I I I I I I I I I I I 
orf 130-1 IAVPILFAAAVSRAFLMNVNP IFF ITVPAI LTAAVFVLYLFTFI PI FRANAFTDDPE 

Homology with a predicted ORF from N. gonorrhoeae

ORF130 (SEQ ID NO: 846) shows 91.7% identity over a 193 aa overlap with a predicted ORF
(ORF130ng) (SEQ ID NO: 852) from N. gonorrhoeae:

orf 130 .pep LKECRLKDPVFIPNIVYKNIAITFLLLHAA 30 

MM lllllll|::|IMIII MMM 
orfl30ng LNLLRAQVHLNMAAVMFVSVRVSVLLGTETLKECRLKDPVFIPNVIYKNIAIT-LLLHAA 201 

orf 130. pep AELWLPAQTAGFTALAVGF I LLAKLRELHHHELLRKHYVRTYYLLQLFAAAGSLWTGAAX 90 

MMMMMMMMMMMMMMMMMMMMMMMMMM MMM 

or f 1 3 0 ng AELWLPAQTAGFTALAVGF I LLAKLRELHHHELLRKH YVRTYYLLQLFAAAGYLWTGAAK 261 

orf 130 .pep LQNLPASAPLHLITLGGMMGGVMMVWLTAGLWHSGFTKLDYPKLCRIAVPILFAAAVSRA 150 

MMMMMMMMM M M M M M M M M M M M M M M M 1 1 1 1 : M 1 1 1 

orf 130ng LQNLPASAPLHLITLGGMTGGVMMVWLTAGLWHSGFTKLDYPKLCRIAVSILFASAVSRA 321 

orf 130 .pep FLXNVNPXFFI TVPAI LTAAVFVLYLFXFI P I FRANAFTDDPE 193 

I MM MMM 1 1 1 1 M h 1 1 M : h 1 1 1 II 1 1 1 1 1 1 1 1 

orf 13 0ng VLMNVNP I FFI TVPE I LTAAVFMLYLLTFVP I FRANAFTDDPE 3 64 

An ORF130ng nucleotide sequence [<SEQ ID 851>] (SEQ ID NO: 851) was predicted to encode a
protein having amino acid sequence [<SEQ ID 852>] (SEQ ID NO: 852):

1 MNKFFTHPMR PFFVGAAVLA ILGALVFFHQ PRRYHPAPPN FLGTYAAGCI

51 RRFFDYRFVG PDGFFRQPET CRYFDGGVVA CCGCFIAVFT ATCRIFRRRL

101 LAGVAAVLRL ADLARRQHRT LRSVDVTAAF TVFQTAYAVS GDLNLLRAQV

151 HLNMAAVMFV SVRVSVLLGT ETLKECRLKD PVFIPNVIYK NIAITLLLHA

201 AAELWLPAQT AGFTALAVGF ILLAKLRELH HHELLRKHYV RTYYLLQLFA

251 AAGYLWTGAA KLQNLPASAP LHLITLGGMT GGVMMVWLTA GLWHSGFTKL

301 DYPKLCRIAV SILFASAVSR AVLMNVNPIF FITVPEILTA AVFMLYLLTF

351 VPIFRANAFT DDPE*

Further work revealed the following gonococcal DNA sequence [<SEQ ID 853>] (SEQ ID NO:
853):



1 ATGCGCCCGT TTTTCGTCGG TGCGGCAGTA CTTGCCATAC TCGGTGCGTT 

51 GGTGTTTTTT ATCAACCCCG GCGCTATCAT CCTGCACCGC CAAATTTTCT 

101 TGGAACTTAT GCTGCCGGCT GCATACGGCG GTTTTTTGAC TACCGCTTTG 

151 TTGGACCGGA CGGGTTTTTC AGGCAACCTG AAACCTGCCG CTACTTTGAT 

201 GGCGGTGTTG TTGCTTGTTG CGGCTGTTTT ATTGCCGTTT TTACCGCAAC
251 TTGCCGCATT TTTCGTCGCC GCCTATTGGC TGGTGTTGCT GCTGTTCTGC 
301 GCCTGGCTGA TTTGGCTCGA CCGCAACACC GACAACTTCG CTCTGTTGAT 

351 GTTACTTGCC GCATTTACCG TTTTTCAGAC GGCCTATGCC GTCAGCGGCG

401 ATTTGAACTT ACTGCGCGCG CAAGTGCATT TGAATATGGC GGCGGTCATG

451 TTCGTATCCG TCCGCGTCAG CGTCCTTTTG GGCACGGAAA CCCTGAAAGA 

501 ATGCCGTCTG AAAGACCCCG TATTCATCCC CAACGTTATC TATAAAAACA 

551 TCGCCATCAC CCTGCTGCTG CACGCCGCCG CCGAACTTTG GCTGCCCGCG 

601 CAAACCGCCG GTTTTACTGC GCTTGCCGTC GGCTTCATCC TGCTCGCCAA 

651 GCTGCGCGAA CTGCACCATC ACGAACTCTT ACGCAAACAC TACGTCCGCA 

701 CTTATTACCT GCTCCAGCTC TTTGCCGCCG CAGGTTATCT GTGGACAGGC 

751 GCGGCGAAAC TGCAAAACCT GCCCGCCTCC GCGCCCCTGC ACCTGATTAC 

801 CCTCGGCGGC ATGACGGGTG GCGTGATGAT GGTGTGGCTG ACTGCCGGAC 

851 TGTGGCACAG CGGCTTTACC AAACTCGACT ACCCGAAACT CTGCCGCATC 

901 GCCGTCTCCA TCCTTTTCGC CTCCGCCGTT TCGCGCGCTG TTTTAATGAA 

951 CGTGAATCCG ATATTCTTCA TCACCGTTCC CGAGATTCTG ACCGCCGCCG 

1001 TGTTCATGCT TTACCTGCTG ACGTTCGTAC CGATTTTTCG AGCGAACGCG 

1051 TTTACAGACG ATCCGGAATA A 



This corresponds to the amino acid sequence [<SEQ ID 854; ORF130ng-1>] (SEQ ID NO: 854;
ORF130ng-1):



1 MRPFFVGAAV LAILGALVFF INPGAIILHR QIFLELMLPA AYGGFLTTAL

51 LDRTGFSGNL KPAATLMAVL LLVAAVLLPF LPQLAAFFVA AYWLVLLLFC

101 AWLIWLDRNT DNFALLMLLA AFTVFQTAYA VSGDLNLLRA QVHLNMAAVM

151 FVSVRVSVLL GTETLKECRL KDPVFIPNVI YKNIAITLLL HAAAELWLPA

201 QTAGFTALAV GFILLAKLRE LHHHELLRKH YVRTYYLLQL FAAAGYLWTG

251 AAKLQNLPAS APLHLITLGG MTGGVMMVWL TAGLWHSGFT KLDYPKLCRI

301 AVSILFASAV SRAVLMNVNP IFFITVPEIL TAAVFMLYLL TFVPIFRANA

351 FTDDPE*



ORF130ng-1 (SEQ ID NO: 854) and ORF130-1 (SEQ ID NO: 848) show 92.4% identity in 357 aa
overlap:



orf 13 0 - 1 . pep MRPFFVGAAVLAILGALVFFINPGAI VLHRQIFLELMLPAAYGGFLTAALLDWTGFSGNL 

I I I I I I I I I I I I I I I I I I I I I I I I I hi I M II I I I I I I M I I i I i ^ I I M IIIMII 
or f 1 3 0 ng - 1 MRPFFVGAAVLAI LGALVFF INPGAI I LHRQ I FLELMLPAAYGGFLTTALLDRTGFSGNL 

orf 13 0 - 1 . pep KPVATLMAALLLAASAILPFSPQTASFFVAAYWLVLLLFCARLIWLDRNTDNFALLMLLA 

I : I I I I I: I I hi - : I I || I : I I I I I II I I I I I I I I I I I I I II I I I I I I I I I I I 
orf 13 Ong- 1 KPAATLI^VLLLVAAVLLPFLPQLAAFFVAAYWLVLLLFCAWLIWLDRNTDNFALLMLLA 

or f 1 3 0 - 1 . pep AFTVFQTAYAVSGDLNLLRAQVHLNMAAVMFVSVRVS I LLGAEALKECRLKDPVF I PN I V 

! I II 1 1 1 II II 1 1 1 II 1 1 II I II 1 1 1 II II I lh II h hh h h I II II I! 1 1 h - 

orf 13 Ong- 1 AFTVFQTAYAVSGDLNLLRAQVHLNMAAVMFVSVRVSVLLGTETLKECRLKDPVFIPNVI 



orf 130-1. pep YKN I AI TFLLLHAAAELWLP AQTAGFTALAVGF I LLAKLRELHHHELLRKHYVRT YYLLQ 

MIIIM I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 
orf 13 Ong- 1 YKN IAIT-LLLHAAAELWLPAQTAGFTALAVGF I LLAKLRELHHHELLRKHYVRT YYLLQ 

r 

orf 13 0-1. pep LFAAAGYLWTGAAKLQNLPASAPLHLITLGGMMGGVMMWLTAGLWHSGFTKLDYPKLCR 

1 1 II II 1 1 1 1 1 1 1 1 II 1 1 1 1 1 II 1 1 1 II 1 1 1 1 1 1 1 1 II 1 1 II 1 1 1 1 1 1 1 1 1 II 1 1 1 1 M 

orf 13 Ong- 1 LFAAAGYLWTGAAKLQNLPASAPLHLITLGGMTGGVMMVWLTAGLWHSGFTKLDYPKLCR 



orf 130-1 .pep IAVPILFAAAVSRAFLMNVNPIFFITVPAILTAAVFVLYLFTFIPIFRANAFTDDPEX 

III lllhlllll IIIIIIIIIIIM I ! I I I I I I I i : I 1 I ) I I I I I I I I I I 1 I 

orfl30ng-l I AVS I LFAS AVSRAVLMNVNP I FF I TVPE I LTAAVFML YLLTFVP I FRANAFTDDPEX 

Based on this analysis, it is predicted that the proteins from N. meningitidis and N. gonorrhoeae, and
their epitopes, could be useful antigens for vaccines or diagnostics, or for raising antibodies. 

Example 101 

The following partial DNA sequence was identified in N. meningitidis [<SEQ ID 855>] (SEQ ID
NO: 855):

1 ATGGAAATTC GGGCAATAAA ATATACGGCA ATGGCTGCGT TGCTTGCATT 
51 TACGGTTGCA GGCTGCCGGC TGGCGGGGTG GTATGAGTGT TCGTCCCTCA 
101 CCGGCTGGTG TAAGCCGAGA AAACCGGCTG CCATCGATTT TTGGGATATT 
151 GGCGGCGAGA GTCCGCCGTC TTTAGGGGAC TACGAGATAC CGCTTTCAGA 
201 CGGCAATAGT TCCGTCAGGG CAAACGAATA TGAATCCGCA CAACAATCTT

251 ACTTTTACAG GAAAATAGGG AAGTTTGAAG C . TGCGGGCT GGATTGGCGT 
301 ACGCGTGACG GCAAACCTTT GATTGAGACG TTCAAACAGG GAGGATTTGA 
351 CTGCTTGGAA AAG . . 

This corresponds to the amino acid sequence [<SEQ ID 856; ORF131>] (SEQ ID NO: 856;
ORF131):

1 MEIRAIKYTA MAALLAFTVA GCRLAGWYEC SSLTGWCKPR KPAAIDFWDI 
51 GGESPPSLGD YEIPLSDGNS SVRANEYESA QQSYFYRKIG KFEXCGLDWR 
101 TRDGKPLIET FKQGGFDCLE K. . 


Further work revealed the complete nucleotide sequence [<SEQ ID 857>] (SEQ ID NO: 857):

1 ATGGAAATTC GGGCAATAAA ATATACGGCA ATGGCTGCGT TGCTTGCATT 

51 TACGGTTGCA GGCTGCCGGC TGGCGGGGTG GTATGAGTGT TCGTCCCTCA 

101 CCGGCTGGTG TAAGCCGAGA AAACCGGCTG CCATCGATTT TTGGGATATT

151 GGCGGCGAGA GTCCGCCGTC TTTAGGGGAC TACGAGATAC CGCTTTCAGA

201 CGGCAATCGT TCCGTCAGGG CAAACGAATA TGAATCCGCA CAACAATCTT

251 ACTTTTACAG GAAAATAGGG AAGTTTGAAG CCTGCGGGCT GGATTGGCGT

301 ACGCGTGACG GCAAACCTTT GATTGAGACG TTCAAACAGG GAGGATTTGA

351 CTGCTTGGAA AAGCAGGGGT TGCGGCGCAA CGGTCTGTCC GAGCGCGTCC

401 GATGGTAA

This corresponds to the amino acid sequence [<SEQ ID 858; ORF131-1>] (SEQ ID NO: 858;
ORF131-1):



1 MEIRAIKYTA MAALLAFTVA GCRLAGWYEC SSLTGWCKPR KPAAIDFWDI
51 GGESPPSLGD YEIPLSDGNR SVRANEYESA QQSYFYRKIG KFEACGLDWR

101 TRDGKPLIET FKQGGFDCLE KQGLRRNGLS ERVRW*
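
The correspondence between the nucleotide listing and the protein listing above can be checked by translating codons in frame 1. The following is a minimal Python sketch, assuming the standard genetic code (whose codon assignments are shared by the bacterial table). The names `CODON_TABLE` and `translate` are illustrative and not part of the original disclosure.

```python
# Standard genetic code, built from the conventional TCAG ordering.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    b1 + b2 + b3: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, b1 in enumerate(BASES)
    for j, b2 in enumerate(BASES)
    for k, b3 in enumerate(BASES)
}


def translate(dna: str) -> str:
    """Translate a DNA string in frame 1; '*' marks a stop codon."""
    dna = dna.upper().replace(" ", "")
    usable = len(dna) - len(dna) % 3  # ignore a trailing partial codon
    return "".join(CODON_TABLE[dna[pos:pos + 3]] for pos in range(0, usable, 3))


# First 30 and last 9 nucleotides of the complete ORF131-1 sequence (SEQ ID NO: 857).
print(translate("ATGGAAATTC GGGCAATAAA ATATACGGCA"))  # MEIRAIKYTA
print(translate("CGATGGTAA"))                         # RW*
```

The two outputs agree with the first ten residues and the final residues (followed by the stop) of the ORF131-1 protein listing.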

Computer analysis of this amino acid sequence gave the following results: 



Homology with a predicted ORF from N. meningitidis (strain A)

ORF131 (SEQ ID NO: 856) shows 95.0% identity over a 121 aa overlap with an ORF (ORF131a)
(SEQ ID NO: 860) from strain A of N. meningitidis:

                        10        20        30        40        50        60
orf131.pep      MEIRAIKYTAMAALLAFTVAGCRLAGWYECSSLTGWCKPRKPAAIDFWDIGGESPPSLGD
                ||||||||||||||||||||||||||||||||| |||||||||||||||||||||||| |
orf131a         MEIRAIKYTAMAALLAFTVAGCRLAGWYECSSLSGWCKPRKPAAIDFWDIGGESPPSLED
                        10        20        30        40        50        60

                        70        80        90       100       110       120
orf131.pep      YEIPLSDGNSSVRANEYESAQQSYFYRKIGKFEXCGLDWRTRDGKPLIETFKQGGFDCLE
                ||||||||| ||||||||||||||||||||||| ||||||||||||||||||| |||||
orf131a         YEIPLSDGNRSVRANEYESAQQSYFYRKIGKFEACGLDWRTRDGKPLIETFKQEGFDCLK
                        70        80        90       100       110       120

orf131.pep      K
                |
orf131a         KQGLRRNGLSERVRWX
                       130



The complete length ORF131a nucleotide sequence [<SEQ ID 859>] (SEQ ID NO: 859) is:

1 ATGGAAATTC GGGCAATAAA ATATACGGCA ATGGCTGCGT TGCTTGCATT

51 TACGGTTGCA GGCTGCCGGT TGGCAGGTTG GTATGAGTGT TCGTCCCTGT 

101 CCGGCTGGTG TAAGCCGAGA AAACCTGCCG CCATCGATTT TTGGGATATT 

151 GGCGGCGAGA GTCCTCCGTC TTTAGAGGAC TACGAGATAC CGCTTTCAGA 

201 CGGCAATCGT TCCGTCAGGG CAAACGAATA TGAATCCGCA CAACAATCTT 

251 ACTTTTACAG GAAAATAGGG AAGTTTGAAG CCTGCGGGTT GGATTGGCGT

301 ACGCGTGACG GCAAACCTTT GATTGAGACG TTCAAACAGG AAGGTTTTGA 

351 TTGTTTGAAA AAGCAGGGGT TGCGGCGCAA CGGTCTGTCC GAGCGCGTCC 

401 GATGGTAA



This encodes a protein having amino acid sequence [<SEQ ID 860>] (SEQ ID NO: 860):



1 MEIRAIKYTA MAALLAFTVA GCRLAGWYEC SSLSGWCKPR KPAAIDFWDI
51 GGESPPSLED YEIPLSDGNR SVRANEYESA QQSYFYRKIG KFEACGLDWR 
101 TRDGKPLIET FKQEGFDCLK KQGLRRNGLS ERVRW* 

ORF131a (SEQ ID NO: 860) and ORF131-1 (SEQ ID NO: 858) show 97.0% identity in 135 aa
overlap:



orf131a.pep     MEIRAIKYTAMAALLAFTVAGCRLAGWYECSSLSGWCKPRKPAAIDFWDIGGESPPSLED
                ||||||||||||||||||||||||||||||||| |||||||||||||||||||||||| |
orf131-1        MEIRAIKYTAMAALLAFTVAGCRLAGWYECSSLTGWCKPRKPAAIDFWDIGGESPPSLGD

orf131a.pep     YEIPLSDGNRSVRANEYESAQQSYFYRKIGKFEACGLDWRTRDGKPLIETFKQEGFDCLK
                ||||||||||||||||||||||||||||||||||||||||||||||||||||| |||||
orf131-1        YEIPLSDGNRSVRANEYESAQQSYFYRKIGKFEACGLDWRTRDGKPLIETFKQGGFDCLE

orf131a.pep     KQGLRRNGLSERVRWX
                ||||||||||||||||
orf131-1        KQGLRRNGLSERVRWX
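
The identity figure quoted for this overlap can be reproduced directly from the two amino acid sequences. The following is a minimal sketch, assuming a gap-free, position-by-position comparison (which holds here because both sequences are 135 residues long); the helper name `percent_identity` is hypothetical and is not the alignment program used to generate the output above.

```python
def percent_identity(a: str, b: str):
    """Return (matches, length, percent) for two pre-aligned, gap-free sequences."""
    a, b = a.replace(" ", ""), b.replace(" ", "")
    if len(a) != len(b):
        raise ValueError("sequences must have the same aligned length")
    matches = sum(x == y for x, y in zip(a, b))
    return matches, len(a), 100.0 * matches / len(a)


# SEQ ID NO: 860 (ORF131a) and SEQ ID NO: 858 (ORF131-1), stop codon omitted.
ORF131A = (
    "MEIRAIKYTA MAALLAFTVA GCRLAGWYEC SSLSGWCKPR KPAAIDFWDI "
    "GGESPPSLED YEIPLSDGNR SVRANEYESA QQSYFYRKIG KFEACGLDWR "
    "TRDGKPLIET FKQEGFDCLK KQGLRRNGLS ERVRW"
)
ORF131_1 = (
    "MEIRAIKYTA MAALLAFTVA GCRLAGWYEC SSLTGWCKPR KPAAIDFWDI "
    "GGESPPSLGD YEIPLSDGNR SVRANEYESA QQSYFYRKIG KFEACGLDWR "
    "TRDGKPLIET FKQGGFDCLE KQGLRRNGLS ERVRW"
)

matches, length, pct = percent_identity(ORF131A, ORF131_1)
print(f"{matches}/{length} = {pct:.1f}%")  # 131/135 = 97.0%
```

The four differing positions (residues 34, 59, 114 and 120) account for the 97.0% figure.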

Homology with a predicted ORF from N. gonorrhoeae

ORF131 (SEQ ID NO: 856) shows 89.3% identity over 121 aa overlap with a predicted ORF
(ORF131ng) (SEQ ID NO: 862) from N. gonorrhoeae:

orf131.pep      MEIRAIKYTAMAALLAFTVAGCRLAGWYECSSLTGWCKPRKPAAIDFWDIGGESPPSLGD 60
                |||| ||||| ||| ||||||||||||||| || ||||||||||||||||||||| || |
orf131ng        MEIRVIKYTATAALFAFTVAGCRLAGWYECLSLSGWCKPRKPAAIDFWDIGGESPLSLED 60

orf131.pep      YEIPLSDGNSSVRANEYESAQQSYFYRKIGKFEXCGLDWRTRDGKPLIETFKQGGFDCLE 120
                ||||||||| ||||||||||| ||||||||||| ||||||||||||| | ||| ||||||
orf131ng        YEIPLSDGNRSVRANEYESAQKSYFYRKIGKFEACGLDWRTRDGKPLVERFKQEGFDCLE 120

orf131.pep      K 121
                |
orf131ng        KQGLRRNGLSERVRW 134

A complete length ORF131ng nucleotide sequence [<SEQ ID 861>] (SEQ ID NO: 861) was
predicted to encode a protein having amino acid sequence [<SEQ ID 862>] (SEQ ID NO: 862):

1 MEIRVIKYTA TAALFAFTVA GCRLAGWYEC LSLSGWCKPR KPAAIDFWDI
51 GGESPLSLED YEIPLSDGNR SVRANEYESA QKSYFYRKIG KFEACGLDWR
101 TRDGKPLVER FKQEGFDCLE KQGLRRNGLS ERVRW* 

Further work revealed the following gonococcal DNA sequence [<SEQ ID 863>] (SEQ ID NO:
863):

1 ATGGAAATTC GGGTAATAAA ATATACGGCA ACGGCTGCGT TGTTTGCATT 

51 TACGGTTGCA GGCTGCCGGC TGGCGGGGTG GTATGAGTGT TCGTCCTTGT 

101 CCGGCTGGTG TAAGCCGAGA AAACCTGCCG CCATCGATTT TTGGGATATT 

151 GGCGGCGAGA GtCcgctGTC TTTAGAGGAC TACGAGATAC CGCTTTCAGA 

201 CGGCAATCGT TCCGTCAGGG CAAACGAATA TGAATCCGCG CAAAAATCTT 

251 ACTTTTATAG GAAAATAGGG AAGTTTGAAG CCTGCGGGTT GGATTGGCGT 

301 ACGCGTGACG GCAAACCTTT GGTTGAGAGG TTCAAACAGG AAGGTTTCGA 

351 CTGTTTGGAA AAGCAGGGGT TGCGGCGCAA CGGCCTGTCC GAGCGCGTCC 

401 GATGGTAA 

This corresponds to the amino acid sequence [<SEQ ID 864; ORF131ng-1>] (SEQ ID NO: 864;
ORF131ng-1):



1 MEIRVIKYTA TAALFAFTVA GCRLAGWYEC SSLSGWCKPR KPAAIDFWDI
51 GGESPLSLED YEIPLSDGNR SVRANEYESA QKSYFYRKIG KFEACGLDWR 
101 TRDGKPLVER FKQEGFDCLE KQGLRRNGLS ERVRW* 

ORF131ng-1 (SEQ ID NO: 864) and ORF131-1 (SEQ ID NO: 858) show 92.6% identity in 135 aa
overlap:



orf131ng-1.pep  MEIRVIKYTATAALFAFTVAGCRLAGWYECSSLSGWCKPRKPAAIDFWDIGGESPLSLED
                |||| ||||| ||| |||||||||||||||||| ||||||||||||||||||||| || |
orf131-1        MEIRAIKYTAMAALLAFTVAGCRLAGWYECSSLTGWCKPRKPAAIDFWDIGGESPPSLGD

orf131ng-1.pep  YEIPLSDGNRSVRANEYESAQKSYFYRKIGKFEACGLDWRTRDGKPLVERFKQEGFDCLE
                ||||||||||||||||||||| ||||||||||||||||||||||||| | ||| ||||||
orf131-1        YEIPLSDGNRSVRANEYESAQQSYFYRKIGKFEACGLDWRTRDGKPLIETFKQGGFDCLE

orf131ng-1.pep  KQGLRRNGLSERVRWX
                ||||||||||||||||
orf131-1        KQGLRRNGLSERVRWX

Based on the presence of a predicted prokaryotic membrane lipoprotein lipid attachment site, it is 
predicted that the proteins from N. meningitidis and N. gonorrhoeae, and their epitopes, could be 
useful antigens for vaccines or diagnostics, or for raising antibodies. 
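
The lipoprotein lipid attachment site mentioned above can be illustrated with a pattern search. The following is a sketch only: the specification does not state which program produced the prediction, and the pattern and positional rules below are an assumption based on the commonly cited PROSITE entry PS00013 (prokaryotic membrane lipoprotein lipid attachment site), which should be verified against the current PROSITE release. The function name `lipobox_cysteine` is hypothetical.

```python
import re

# Assumed PS00013-style pattern:
# {DERK}(6)-[LIVMFWSTAG](2)-[LIVMFYSTAGCQ]-[AGS]-C
LIPOBOX = re.compile(r"[^DERK]{6}[LIVMFWSTAG]{2}[LIVMFYSTAGCQ][AGS]C")


def lipobox_cysteine(protein: str):
    """Return the 1-based position of the candidate lipid-attachment Cys, or None.

    Assumed additional rules: the Cys lies between positions 15 and 35, and
    at least one Lys or Arg occurs within the first seven residues.
    """
    seq = protein.replace(" ", "").upper()
    has_basic_start = "K" in seq[:7] or "R" in seq[:7]
    for match in LIPOBOX.finditer(seq):
        cys_position = match.end()  # end offset equals the 1-based Cys position
        if has_basic_start and 15 <= cys_position <= 35:
            return cys_position
    return None


# First 50 residues of ORF131-1 (SEQ ID NO: 858).
ORF131_1_START = "MEIRAIKYTA MAALLAFTVA GCRLAGWYEC SSLTGWCKPR KPAAIDFWDI"
print(lipobox_cysteine(ORF131_1_START))  # 22
```

Under these assumptions the candidate cysteine is Cys-22, immediately after the hydrophobic leader MEIRAIKYTAMAALLAFTVAG.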



Example 102 



The following partial DNA sequence was identified in N. meningitidis [<SEQ ID 865>] (SEQ ID
NO: 865)

1 ATGAAACACA TCCATATTAT CGGTATCGGC GGCACGTTTA TGGGCGGGCT
51 TGCCGCCATT GCCAAAGAAG CGGGGTTTGA AGTCAGCGGT TGCGACGCGA
101 AGATGTATCC GCCGATGAGC ACCCAGCTCG AAGCCTTGGG TATAGACGTG
151 TATGAAGGCT TCGATGCCGC TCAGTTGGAC GAATTTAAAG CCGACGTTTA
201 CGTTATCGGC AATGTCGCCA AGCGCGGGAT GGATGTGGTT GAAGCGATTT
251 TGAACCTCGG CCTGCCtTAT ATtTcCGGCC CGCAATGGCT GTCGGAAAAC
301 GTGCTGCACC ATCATTGGGT ACTCGGTGTG GCGGGGACgC ACGGCAAAAC
351 GACCACCGCC TCCATGCTCG CATGGGTCTT GGAATATgCC GGCCTCGCGC
401 CGGGCTTCCT TATtGGCGGC GTACC.GGAA AATttCGGCG TTTCCGCCCG
451 CCTGCCGCAA ACGCCGCGCC AAGACCCGAA CAGCCAATCG CCGTTTTTcG
501 TCATCGAAGC CGACGAATAC GACACCGCCT TTtTCGACAA ACGTTCTAAA
551 TtCGTGCATT ACCGTCCGCG TACCGCCGTG TTGAACAATC TGGAATTCGA
601 CCACGCCGAC ATCTTTGCCG ACTTGGGCGC GATACAGACc CAGTTCCACT
651 ACCTCGTGCG TACCGTGCCG TCTGAAGGCT TAATCGTCTG CAACGGACGG
701 CAGCAAAGCC TGCAAGATAC TTTGGACAAA GGCTGCTGGA CGCCGGTGGA
751 AAAATTCGGC ACGGAACACG GCTGGCA..

This corresponds to the amino acid sequence [<SEQ ID 866; ORF132>] (SEQ ID NO: 866;
ORF132):



1 MKHIHIIGIG GTFMGGLAAI AKEAGFEVSG CDAKMYPPMS TQLEALGIDV 

51 YEGFDAAQLD EFKADVYVIG NVAKRGMDVV EAILNLGLPY ISGPQWLSEN

101 VLHHHWVLGV AGTHGKTTTA SMLAWVLEYA GLAPGFLIGG VXGKFRRFRP 

151 PAANAAPRPE QPIAVFRHRS RRIRHRLFRQ TFXIRALPSA YRRVEQSGIR 

201 PRRHLCRLGR DTDPVPLPRA YRAVXRLNRL QRTAAKPARY FGQRLLDAGG 

251 KIRHGTRLA.. 

Further work revealed the complete nucleotide sequence [<SEQ ID 867>] (SEQ ID NO: 867):



1 ATGAAACACA TCCATATTAT CGGTATCGGC GGCACGTTTA TGGGCGGGCT 
51 TGCCGCCATT GCCAAAGAAG CGGGGTTTGA AGTCAGCGGT TGCGACGCGA 
101 AGATGTATCC GCCGATGAGC ACCCAGCTCG AAGCCTTGGG TATAGACGTG

151 TATGAAGGCT TCGATGCCGC TCAGTTGGAC GAATTTAAAG CCGACGTTTA 

201 CGTTATCGGC AATGTCGCCA AGCGCGGGAT GGATGTGGTT GAAGCGATTT 

251 TGAACCTCGG CCTGCCTTAT ATTTCCGGCC CGCAATGGCT GTCGGAAAAC 

301 GTGCTGCACC ATCATTGGGT ACTCGGTGTG GCGGGGACGC ACGGCAAAAC 

351 GACCACCGCC TCCATGCTCG CATGGGTCTT GGAATATGCC GGCCTCGCGC 

401 CGGGCTTCCT TATTGGCGGC GTACCGGAAA ATTTCGGCGT TTCCGCCCGC 

451 CTGCCGCAAA CGCCGCGCCA AGACCCGAAC AGCCAATCGC CGTTTTTCGT

501 CATCGAAGCC GACGAATACG ACACCGCCTT TTTCGACAAA CGTTCTAAAT 

551 TCGTGCATTA CCGTCCGCGT ACCGCCGTGT TGAACAATCT GGAATTCGAC 

601 CACGCCGACA TCTTTGCCGA CTTGGGCGCG ATACAGACCC AGTTCCACTA 

651 CCTCGTGCGT ACCGTGCCGT CTGAAGGCTT AATCGTCTGC AACGGACGGC 

701 AGCAAAGCCT GCAAGATACT TTGGACAAAG GCTGCTGGAC GCCGGTGGAA 

751 AAATTCGGCA CGGAACACGG CTGGCAGGCC GGCGAAGCCA ATGCCGACGG

801 CTCGTTCGAC GTGTTGCTCG ACGGCAAAAC CGCCGGACGC GTCAAATGGG 

851 ATTTGATGGG CAGGCACAAC CGCATGAACG CGCTCGCCGT CATTGCCGCC 

901 GCGCGTCATG TCGGTGTCGA TATTCAGACC GCCTGCGAAG CCTTGGGCGC 

951 GTTTAAAAAC GTCAAACGCC GGATGGAAAT CAAAGGCACG GCAAACGGCA 

1001 TCACCGTTTA CGACGACTTC GCCCACCACC CGACCGCCAT CGAAACCACG 

1051 ATTCAAGGTT TGCGCCAACG CGTCGGCGGC GCGCGCATCC TCGCCGTCCT 

1101 CGAACCGCGT TCCAACACGA TGAAGCTGGG CACGATGAAG TCCGCCCTGC 

1151 CTGTAAGCCT CAAAGAAGCC GACCAAGTGT TCTGCTACGC CGGCGGCGTG 

1201 GACTGGGACG TCGCCGAAGC CCTCGCGCCT TTGGGCGGCA GGCTGAACGT

1251 CGGCAAAGAC TTCGATGCCT TCGTTGCCGA AATCGTGAAA AACGCCGAAG 

1301 TAGGCGACCA TATTTTGGTG ATGAGCAACG GCGGTTTCGG CGGAATACAC 

1351 GGAAAGCTGC TGGAAGCTTT GAGATAG 

This corresponds to the amino acid sequence [<SEQ ID 868; ORF132-1>] (SEQ ID NO: 868;
ORF132-1):



1 MKHIHIIGIG GTFMGGLAAI AKEAGFEVSG CDAKMYPPMS TQLEALGIDV
51 YEGFDAAQLD EFKADVYVIG NVAKRGMDVV EAILNLGLPY ISGPQWLSEN
101 VLHHHWVLGV AGTHGKTTTA SMLAWVLEYA GLAPGFLIGG VPENFGVSAR
151 LPQTPRQDPN SQSPFFVIEA DEYDTAFFDK RSKFVHYRPR TAVLNNLEFD
201 HADIFADLGA IQTQFHYLVR TVPSEGLIVC NGRQQSLQDT LDKGCWTPVE
251 KFGTEHGWQA GEANADGSFD VLLDGKTAGR VKWDLMGRHN RMNALAVIAA
301 ARHVGVDIQT ACEALGAFKN VKRRMEIKGT ANGITVYDDF AHHPTAIETT
351 IQGLRQRVGG ARILAVLEPR SNTMKLGTMK SALPVSLKEA DQVFCYAGGV
401 DWDVAEALAP LGGRLNVGKD FDAFVAEIVK NAEVGDHILV MSNGGFGGIH
451 GKLLEALR*

Computer analysis of this amino acid sequence gave the following results: 



Homology with the hypothetical o457 protein (SEQ ID NO: 1166) of E. coli (accession number
U14003)

ORF132 (SEQ ID NO: 866) and o457 (SEQ ID NO: 1166) show 58% aa identity in 140 aa overlap:



Orf132: 4    IHIIGIGGTFMGGLAAIAKEAGFEVSGCDAKMYPPMSTQLEALGIDVYEGFDAAQLDEFK 63
             IHI+GI GTFMGGLA +A+ + G EV+G DA +YPPMST LE GI++ +G+DA+QL+ +
o457:   3    IHILGICGTFMGGLAMLARQLGHEVTGSDANVYPPMSTLLEKQGIELIQGYDASQLEP-Q 61

Orf132: 64   ADVYVIGNVAKRGMDVVEAILNLGLPYISGPQWLSENVLHHHWVLGVAGTHGKTTTASML 123
             D+ +IGN RG VEA+L +PY+SGPQWL + VL WVL VAGTHGKTTTA M
o457:   62   PDLVIIGNAMTRGNPCVEAVLEKNIPYMSGPQWLHDFVLRDRWVLAVAGTHGKTTTAGMA 121

Orf132: 124  AWVLEYAGLAPGFLIGGVXG 143
             W+LE G PGF+IGGV G
o457:   122  TWILEQCGYKPGFVIGGVPG 141

Homology with a predicted ORF from N. meningitidis (strain A) 

ORF132 (SEQ ID NO: 866) shows 74.6% identity over a 189aa overlap with an ORF (ORF132a)
(SEQ ID NO: 870) from strain A of N. meningitidis:

10 20 30 40 50 60 

orf 132 .pep MKH I H I I G I GGTFMGGLAAI AKEAGFEVSGCDAKMYPPMSTQLEALG I DVYEGFDAAQLD 

Illllllllllllllhllllllllll IIIIIIIIIIIIIIIIMII lllllhllM 
orf 132a MKHIHIIGIGGTFMGGIAAIAKEAGFEXSGCDAKMYPPMSTQLEALGIGVYEGFDTAQLD 

10 20 30 40 50 60 

70 80 90 100 110 120 

orf 132 . pep EFKADVYVIGNVAKRGMDWEAILNLGLPYISGPQWLSENVLHHHWVLGVAGTHGKTTTA 

1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 II II Ml Mlhll Mill MM I II 1 1 II I 

orf 132a EFKADVYV I GNVAKRGMDWEAI LNRGLP Y I SGPQWLAENXLHHHWXLGVAXTHGKTTTA 

70 80 90 100 110 120 

130 140 150 160 
orfl32.pep SMLAWVLE YAGLAPG FL I GGVXGKFR RFRP P AANAAPRP EQP I AVFR 

I I I I I I I M Mill I MM M h I - I :M: I I 

O r f 1 3 2 a SMLAWVLEYAGLAPGFX I GGVP ENFS VS ARL - PQTPRQDPNSQS PFFVI EADEYDTAFFD 

130 140 150 160 170 

170 180 190 200 210 220 

orf 132 . pep HRSRRIRHRLFRQTFXIRALPSAYRRVEQSGIRPRRHLCRLGRDTDPVPLPRAYRAVXRL 
:||: :::| 

orf 132a KRSKFVHYRPRTAVLNNLEFDHADIFADLGAIQTQFHHLVRTVPSEGLIVCNGRQQSLQD 
180 190 200 210 220 230 



The complete length ORF132a nucleotide sequence [<SEQ ID 869>] (SEQ ID NO: 869) is:



1 ATGAAACACA TCCACATTAT CGGTATCGGC GGCACGTTTA TGGGTGGGAT 

51 TGCCGCCATT GCCAAAGAAG CAGGGTTTGA ANTCAGCGGT TGCGATGCGA 

101 AGATGTATCC GCCGATGAGC ACCCAGCTCG AAGCCTTGGG CATAGGCGTG 

151 TATGAAGGCT TCGACACCGC GCAGTTGGAC GAATTTAAAG CCGACGTTTA 

201 CGTTATCGGC AATGTCGCCA AGCGCGGGAT GGATGTGGTT GAAGCGATTT

251 TGAACCGTGG GCTGCCTTAT ATTTCCGGCC CGCAATGGCT GGCTGAAAAC

301 NTGCTGCACC ATCATTGGNN ACTCGGCGTG GCGGNGACGC ACGGCAAAAC 

351 GACCACCGCG TCTATGCTCG CGTGGGTTTT GGAATATGCC GGACTCGCAC 

401 CGGGCTTCNT TATCGGCGGC GTACCGGAAA ACTTCAGCGT TTCCGCCCGC

451 CTGCCGCAAA CGCCGCGCCA AGACCCGAAC AGCCAATCGC CGTTTTTCGT

501 CATTGAAGCC GACGAATACG ACACCGCGTT TTTCGACAAA CGCTCCAAAT 

551 TCGTGCATTA CCGTCCGCGT ACCGCCGTGT TGAACAATCT GGAATTCGAC 

601 CACGCCGACA TCTTCGCCGA TTTGGGCGCG ATACAGACCC AGTTCCACCA 

651 CCTCGTGCGT ACCGTGCCGT CTGAAGGCCT CATCGTCTGC AACGGACGGC 

701 AGCAAAGCCT GCAAGACACT TTGGACAAAG GCTGCTGGAC GCCGGTGGAA 

751 AAATTCGGCA CGGAACACGG CTGGCAGGCC GGCGAAGCCA ATGCCGATGG 

801 CTCGTTCGAC GTGTTGCTTG ACGGCAAAAA AGCCGGACAC GTCGCTTGGA 

851 GTTTGATGGG CGGACACAAC CGCATGAACG CGCTCGCNGT CATCGCCGCC

901 GCGCGTCATG CCGGAGTNGA CATTCAGACG GCCTGCGAAG CCTTGAGCAC 

951 GTTTAAAAAC GTCAAACGCC GCATGGAAAT CAAAGGCACG GCAAACGGTA 

1001 TCACCGTTTA CGACGACTTC GCCCACCATC CGACCGCTAT CGAAACCACG 

1051 ATTCAAGGTT TGCGCCAGCG CGTCGGCGGC GCGCGCATCC TCGCCGTCCT

1101 CGAACCGCGT TCCAATACGA TGAAGCTGGG TACGATGAAA GCCGCCCTGC

1151 CCGCAAGCCT CAAAGAAGCC GACCAAGTGT TCTGNTACGC CGGCGGCGCG 

1201 GACTGGGACG TTGCCGAAGC CCTCGCGCCT TTGGGCGGCA GGCTGCACGT

1251 CGGCAAAGAC TTCGATGCCT TCGTTGCCGA AATCGTGAAA AACGCCGAAG 

1301 CAGGCGACCA TATTTTGGTG ATGAGCAACG GCGGTTTCGG CGGAATACAC

1351 ACCAAACTGC TGGACGCTTT GAGATAG 

This encodes a protein having amino acid sequence [<SEQ ID 870>] (SEQ ID NO: 870):

1 MKHIHIIGIG GTFMGGIAAI AKEAGFEXSG CDAKMYPPMS TQLEALGIGV
51 YEGFDTAQLD EFKADVYVIG NVAKRGMDVV EAILNRGLPY ISGPQWLAEN
101 XLHHHWXLGV AXTHGKTTTA SMLAWVLEYA GLAPGFXIGG VPENFSVSAR
151 LPQTPRQDPN SQSPFFVIEA DEYDTAFFDK RSKFVHYRPR TAVLNNLEFD
201 HADIFADLGA IQTQFHHLVR TVPSEGLIVC NGRQQSLQDT LDKGCWTPVE
251 KFGTEHGWQA GEANADGSFD VLLDGKKAGH VAWSLMGGHN RMNALAVIAA
301 ARHAGVDIQT ACEALSTFKN VKRRMEIKGT ANGITVYDDF AHHPTAIETT
351 IQGLRQRVGG ARILAVLEPR SNTMKLGTMK AALPASLKEA DQVFXYAGGA
401 DWDVAEALAP LGGRLHVGKD FDAFVAEIVK NAEAGDHILV MSNGGFGGIH
451 TKLLDALR*

ORF132a (SEQ ID NO: 870) and ORF132-1 (SEQ ID NO: 868) show 93.9% identity in 458 aa
overlap:

orf132a.pep     MKHIHIIGIGGTFMGGIAAIAKEAGFEXSGCDAKMYPPMSTQLEALGIGVYEGFDTAQLD
                |||||||||||||||| |||||||||| |||||||||||||||||||| |||||| ||||
orf132-1        MKHIHIIGIGGTFMGGLAAIAKEAGFEVSGCDAKMYPPMSTQLEALGIDVYEGFDAAQLD

orf132a.pep     EFKADVYVIGNVAKRGMDVVEAILNRGLPYISGPQWLAENXLHHHWXLGVAXTHGKTTTA
                ||||||||||||||||||||||||| ||||||||||| || ||||| |||| ||||||||
orf132-1        EFKADVYVIGNVAKRGMDVVEAILNLGLPYISGPQWLSENVLHHHWVLGVAGTHGKTTTA

orf132a.pep     SMLAWVLEYAGLAPGFXIGGVPENFSVSARLPQTPRQDPNSQSPFFVIEADEYDTAFFDK
                |||||||||||||||| |||||||| ||||||||||||||||||||||||||||||||||
orf132-1        SMLAWVLEYAGLAPGFLIGGVPENFGVSARLPQTPRQDPNSQSPFFVIEADEYDTAFFDK

orf132a.pep     RSKFVHYRPRTAVLNNLEFDHADIFADLGAIQTQFHHLVRTVPSEGLIVCNGRQQSLQDT
                |||||||||||||||||||||||||||||||||||| |||||||||||||||||||||||
orf132-1        RSKFVHYRPRTAVLNNLEFDHADIFADLGAIQTQFHYLVRTVPSEGLIVCNGRQQSLQDT

orf132a.pep     LDKGCWTPVEKFGTEHGWQAGEANADGSFDVLLDGKKAGHVAWSLMGGHNRMNALAVIAA
                |||||||||||||||||||||||||||||||||||| || | | ||| ||||||||||||
orf132-1        LDKGCWTPVEKFGTEHGWQAGEANADGSFDVLLDGKTAGRVKWDLMGRHNRMNALAVIAA

orf132a.pep     ARHAGVDIQTACEALSTFKNVKRRMEIKGTANGITVYDDFAHHPTAIETTIQGLRQRVGG
                ||| |||||||||||  |||||||||||||||||||||||||||||||||||||||||||
orf132-1        ARHVGVDIQTACEALGAFKNVKRRMEIKGTANGITVYDDFAHHPTAIETTIQGLRQRVGG

orf132a.pep     ARILAVLEPRSNTMKLGTMKAALPASLKEADQVFXYAGGADWDVAEALAPLGGRLHVGKD
                |||||||||||||||||||| ||| ||||||||| |||| ||||||||||||||| ||||
orf132-1        ARILAVLEPRSNTMKLGTMKSALPVSLKEADQVFCYAGGVDWDVAEALAPLGGRLNVGKD

orf132a.pep     FDAFVAEIVKNAEAGDHILVMSNGGFGGIHTKLLDALRX
                ||||||||||||| |||||||||||||||| ||| ||||
orf132-1        FDAFVAEIVKNAEVGDHILVMSNGGFGGIHGKLLEALRX

Homology with a predicted ORF from N. gonorrhoeae 

ORF132 (SEQ ID NO: 866) shows 89.6% identity over 259 aa overlap with a predicted ORF
(ORF132ng) (SEQ ID NO: 872) from N. gonorrhoeae:

orf132.pep      MKHIHIIGIGGTFMGGLAAIAKEAGFEVSGCDAKMYPPMSTQLEALGIDVYEGFDAAQLD 60
                |||||||||||||||| ||||||||| ||||||||||||||||||||| | ||||||||
orf132ng        MKHIHIIGIGGTFMGGIAAIAKEAGFKVSGCDAKMYPPMSTQLEALGIGVHEGFDAAQLE 60

orf132.pep      EFKADVYVIGNVAKRGMDVVEAILNLGLPYISGPQWLSENVLHHHWVLGVAGTHGKTTTA 120
                || || ||||||| ||||||||||| ||||||||||| ||||||||||||||||||||||
orf132ng        EFQADIYVIGNVARRGMDVVEAILNRGLPYISGPQWLAENVLHHHWVLGVAGTHGKTTTA 120

orf132.pep      SMLAWVLEYAGLAPGFLIGGVXGKFRRFRPPAANAAPRPEQPIAVFRHRSRRIRHRLFRQ 180
                ||||||||||||||||||||| ||||||||| |||| |||| ||||||||||||||||||
orf132ng        SMLAWVLEYAGLAPGFLIGGVPGKFRRFRPPTANAASRPEQQIAVFRHRSRRIRHRLFRQ 180

orf132.pep      TFXIRALPSAYRRVEQSGIRPRRHLCRLGRDTDPVPLPRAYRAVXRLNRLQRTAAKPARY 240
                |  ||||  |||||||||||||||| |||||||||| ||| |   |  ||||||||||||
orf132ng        TLQIRALSPAYRRVEQSGIRPRRHLRRLGRDTDPVPPPRAHRTIRRPHRLQRTAAKPARY 240

orf132.pep      FGQRLLDAGGKIRHGTRLA 259
                |||||||||||||| ||||
orf132ng        FGQRLLDAGGKIRHRTRLADW 261

An ORF132ng nucleotide sequence [<SEQ ID 871>] (SEQ ID NO: 871) was predicted to encode a
protein having amino acid sequence [<SEQ ID 872>] (SEQ ID NO: 872):

1 MKHIHIIGIG GTFMGGIAAI AKEAGFKVSG CDAKMYPPMS TQLEALGIGV
51 HEGFDAAQLE EFQADIYVIG NVARRGMDVV EAILNRGLPY ISGPQWLAEN
101 VLHHHWVLGV AGTHGKTTTA SMLAWVLEYA GLAPGFLIGG VPGKFRRFRP
151 PTANAASRPE QQIAVFRHRS RRIRHRLFRQ TLQIRALSPA YRRVEQSGIR
201 PRRHLRRLGR DTDPVPPPRA HRTIRRPHRL QRTAAKPARY FGQRLLDAGG
251 KIRHRTRLAD W*

Further work revealed the following gonococcal DNA sequence [<SEQ ID 873>] (SEQ ID NO:
873):

1 ATGAAACACA TCCACATTAT CGGTATCGGC GGCACGTTTA TGGGCGGGAT 

51 TGCCGCCATT GCCAAAGAAG CCGGGTTCAA AGTCAGCGGT TGCGACGCGA

101 AGATGTATCC GCCGATGAGC ACCCAGCTCG AAGCCTTGGG CATAGGCGTA 

151 CACGAAGGCT TCGATGCCGC GCAGTTGGAA GAATTTCAAG CCGATATTTA 

201 CGTCATCGGC AATGTCGCCA GGCGCGGGAT GGATGTGGTC GAGGCGATTT

251 TGAACCGTGG GCTGCCTTAT ATTTCCGGCC CGCAATGGCT GGCTGAAAac 

301 GTGCtgcacc atcaTTGGgt ACTCGGCGTG GcagggaCGC ACGGcaaAac

351 gaccaCcGcg tCCATGCTCG CCTGGGTCTT GGAATATGCC GGACTCGCGC 

401 CGGGCTTCCT CATCGGCGGt gtaccggaAA ATTTCGGCGT TTCCGCCCGC

451 CTACCGCAAA CGCCGCGTCA AGACCCGAAC AGCAAATCGC CGTTTTTCGT

501 CATCGAAGCC GACGAATACG ACACCGCCTT TTTCGACAAA CGCTCCAAAT

551 TCGTGCATTA TCGCCCGCGT ACCGCCGTGT TGAACAATCT GGAATTCGAC 

601 CACGCCGACA TCTTCGCCGA CTTGGGCGCG ATACAGACCC AGTTCCACCA 

651 CCTCGTGCGC ACCGTACCAT CCGAAGGCCT CATCGTCTGC AACGGACAGC 

701 AGCAAAGCCT GCAAGATACT TTGGACAAAG GCTGCTGGAC GCCGGTGGAA 

751 AAATTCGGCA CCGGACACGG CTGGCAGATT GGTGAAGTCA ATGCCGACGG 

801 CTCGTTCGAC GTATTGCTTG ACGGCAAAAA AGCCGGACAC GTCGCATGGG 

851 ATTTGATGGG CGGACACAAC CGCATGAACG CGCTCGCCGT CATCGCTGCC 

901 GCACGCCATG CCGGAGTCGA TGTTCAGACG GCCTGCGAAG CCTTGGGTGC 

951 GTTTAAAAAC GTCAAACGCC GCATGGAAAT CAAAGGCACG GCAAACGGCA 

1001 TCACCGTTTA CGACGATTTC GCCCACCACC CGACCGCCAT CGAAACCACG 

1051 ATTCAAGGTT TGCGCCAACG TGTCGGCGGC GCGCGCATCC TCGCCGTCCT 

1101 CGAGCCGCGT TCCAACACCA TGAAACTCGG CACGATGAAG TCCGCCCTGC 

1151 CCGCAAGCCT CAAAGAAGCC GACCAAGTGT TCTGCTACGC CGGCGGCGCG 

1201 GACTGGGACG TTGCCGAAGC CCTCGCGCCT TTGGGCTGCA GGCTGCGCGT 

1251 CGGTAAAGAT TTCGATACCT TCGTTGCCGA AATTGTGAAA AACGCCCGAA 

1301 CCGGCGACCA TATTTTGGTG ATGAGCAACG GCGGTTTCGG CGGAATACAC

1351 ACCAAACTGC TGGACGCTTT GAGATAG 

This corresponds to the amino acid sequence [<SEQ ID 874; ORF132ng-1>] (SEQ ID NO: 874;
ORF132ng-1):

1 MKHIHIIGIG GTFMGGIAAI AKEAGFKVSG CDAKMYPPMS TQLEALGIGV
51 HEGFDAAQLE EFQADIYVIG NVARRGMDVV EAILNRGLPY ISGPQWLAEN
101 VLHHHWVLGV AGTHGKTTTA SMLAWVLEYA GLAPGFLIGG VPENFGVSAR
151 LPQTPRQDPN SKSPFFVIEA DEYDTAFFDK RSKFVHYRPR TAVLNNLEFD
201 HADIFADLGA IQTQFHHLVR TVPSEGLIVC NGQQQSLQDT LDKGCWTPVE
251 KFGTGHGWQI GEVNADGSFD VLLDGKKAGH VAWDLMGGHN RMNALAVIAA
301 ARHAGVDVQT ACEALGAFKN VKRRMEIKGT ANGITVYDDF AHHPTAIETT
351 IQGLRQRVGG ARILAVLEPR SNTMKLGTMK SALPASLKEA DQVFCYAGGA
401 DWDVAEALAP LGCRLRVGKD FDTFVAEIVK NARTGDHILV MSNGGFGGIH
451 TKLLDALR*

ORF132ng-1 (SEQ ID NO: 874) and ORF132-1 (SEQ ID NO: 868) show 93.2% identity in 458 aa
overlap:

orf132ng-1.pep  MKHIHIIGIGGTFMGGIAAIAKEAGFKVSGCDAKMYPPMSTQLEALGIGVHEGFDAAQLE
                |||||||||||||||| ||||||||| ||||||||||||||||||||| | ||||||||
orf132-1        MKHIHIIGIGGTFMGGLAAIAKEAGFEVSGCDAKMYPPMSTQLEALGIDVYEGFDAAQLD

orf132ng-1.pep  EFQADIYVIGNVARRGMDVVEAILNRGLPYISGPQWLAENVLHHHWVLGVAGTHGKTTTA
                || || ||||||| ||||||||||| ||||||||||| ||||||||||||||||||||||
orf132-1        EFKADVYVIGNVAKRGMDVVEAILNLGLPYISGPQWLSENVLHHHWVLGVAGTHGKTTTA

orf132ng-1.pep  SMLAWVLEYAGLAPGFLIGGVPENFGVSARLPQTPRQDPNSKSPFFVIEADEYDTAFFDK
                ||||||||||||||||||||||||||||||||||||||||| ||||||||||||||||||
orf132-1        SMLAWVLEYAGLAPGFLIGGVPENFGVSARLPQTPRQDPNSQSPFFVIEADEYDTAFFDK

orf132ng-1.pep  RSKFVHYRPRTAVLNNLEFDHADIFADLGAIQTQFHHLVRTVPSEGLIVCNGQQQSLQDT
                |||||||||||||||||||||||||||||||||||| ||||||||||||||| |||||||
orf132-1        RSKFVHYRPRTAVLNNLEFDHADIFADLGAIQTQFHYLVRTVPSEGLIVCNGRQQSLQDT

orf132ng-1.pep  LDKGCWTPVEKFGTGHGWQIGEVNADGSFDVLLDGKKAGHVAWDLMGGHNRMNALAVIAA
                |||||||||||||| |||| || ||||||||||||| || | ||||| ||||||||||||
orf132-1        LDKGCWTPVEKFGTEHGWQAGEANADGSFDVLLDGKTAGRVKWDLMGRHNRMNALAVIAA

orf132ng-1.pep  ARHAGVDVQTACEALGAFKNVKRRMEIKGTANGITVYDDFAHHPTAIETTIQGLRQRVGG
                ||| ||| ||||||||||||||||||||||||||||||||||||||||||||||||||||
orf132-1        ARHVGVDIQTACEALGAFKNVKRRMEIKGTANGITVYDDFAHHPTAIETTIQGLRQRVGG

orf132ng-1.pep  ARILAVLEPRSNTMKLGTMKSALPASLKEADQVFCYAGGADWDVAEALAPLGCRLRVGKD
                |||||||||||||||||||||||| |||||||||||||| |||||||||||| || ||||
orf132-1        ARILAVLEPRSNTMKLGTMKSALPVSLKEADQVFCYAGGVDWDVAEALAPLGGRLNVGKD

orf132ng-1.pep  FDTFVAEIVKNARTGDHILVMSNGGFGGIHTKLLDALRX
                || |||||||||  |||||||||||||||| ||| ||||
orf132-1        FDAFVAEIVKNAEVGDHILVMSNGGFGGIHGKLLEALRX

In addition, ORF132ng-1 (SEQ ID NO: 874) is homologous to a hypothetical E. coli protein (SEQ
ID NO: 1166):

pir||S56459 hypothetical protein o457 - Escherichia coli >gi|537075 (U14003)
ORF_o457 [Escherichia coli] >gi|1790680 (AE000494) hypothetical 48.5 kD protein in
fbp-pmba intergenic region [Escherichia coli] Length = 457

Score = 474 bits (1207), Expect = e-133 

Identities = 249/439 (56%), Positives = 294/439 (66%), Gaps = 13/439 (2%) 



Query: 22
       ++ G +V+G DA +YPPMST LE GI + +G+DA+QLE Q D+ +IGN RG VE
Sbjct: 21

Query: 82
       A+L + +PY+SGPQWL + VL WVL VAGTHGKTTTA M W+LE G PGF+IGGV
Sbjct: 80

Query: 142
       P NF VSA L +S FFVIEADEYD AFFDKRSKFVHY PRT +LNNLEFDH
Sbjct: 140
       NTFEVSAHL GESDFFVI EADE YDCAFFDKRS KFVHYCPRTL I LNNLEFDH 190

Query: 202
       ADIF DL AIQ QFHHLVR VP +G 1+ +L+ T+ GCW+ EG WQ
Sbjct: 191

Query: 262
       ++ D S ++VLLDG+K G V W L+G HN N L IAAARH GV A ALG+F N
Sbjct: 251

Query: 321
       +RR+E++G ANG+TVYDDFAHHPTAI T+ LR +VGG ARI + AVLEPRSNTMK+G
Sbjct: 311

Query: 380
       K L SL AD+VF W VAE D DT +VK A+ GDHI
Sbjct: 371

Query: 439
       LVMSNGGFGGIH KLLD L
Sbjct: 431
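
The percentages in the BLAST summary line above (Identities = 249/439 (56%), Positives = 294/439 (66%), Gaps = 13/439 (2%)) can be checked by simple arithmetic. The following sketch shows that all three reported values are consistent with truncation to a whole percent rather than rounding; this is an observation about the numbers shown, not a statement about the internals of the program that produced them, and the function name `truncated_percent` is hypothetical.

```python
def truncated_percent(part: int, whole: int) -> int:
    """Whole-number percentage obtained by discarding the fractional part."""
    return (100 * part) // whole


summary = {
    "Identities": (249, 439),
    "Positives": (294, 439),
    "Gaps": (13, 439),
}

for label, (part, whole) in summary.items():
    exact = 100.0 * part / whole
    print(f"{label}: {part}/{whole} = {exact:.2f}% "
          f"(truncated {truncated_percent(part, whole)}%, rounded {round(exact)}%)")
```

Truncation gives 56%, 66% and 2%, matching the summary line, whereas rounding would give 57%, 67% and 3%.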



Based on this analysis, it was predicted that these proteins from N. meningitidis and N. gonorrhoeae, 
and their epitopes, could be useful antigens for vaccines or diagnostics, or for raising antibodies. 

ORF132-1 (SEQ ID NO: 868) (26.4kDa) was cloned in pET and pGex vectors and expressed in
E. coli, as described above. The products of protein expression and purification were analyzed by
SDS-PAGE. Figure 20A shows the results of affinity purification of the His-fusion protein, and
Figure 20B shows the results of expression of the GST-fusion in E. coli. Purified His-fusion protein
was used to immunise mice, whose sera were used for FACS analysis (Figure 20C) and ELISA
(positive result). These experiments confirm that ORF132 (SEQ ID NO: 866) is a surface-exposed
protein, and that it is a useful immunogen.

Example 103 

The following partial DNA sequence was identified in N. meningitidis [<SEQ ID 875>] (SEQ ID
NO: 875)

1 . . CCGGGCTATT ACGGCTCGGA TGACGAATTT AAGCGGGCAT TCGGAGAAAA 

51 CTCGCCGACA TmCAAGAAAC ATTGCAACCG GAGCTGCGGG ATTTATGAAC 

101 CCGTATTGAA AAAATACGGC AAAAAGCGCG CCAACAACCA TTCGGTCAGC 

151 ATTAGTGCGG ACTTCGGCGA TTATTTCATG CCGTTCGCCA GCTATTCGCG 

201 CACACACCGT ATGCCCAACA TCCAAGAAAT GTATTTTTCC CAAATCGGCG 

251 ACTCCGGCGT TCACACCGCC TTAAAACCAG AGCGCGCAAA CACTTGGCAA 

301 TTTGGCTTCr ATACCTATAA AAAAGGATTG TTAAAACAAG ATGATACATT
351 AGGATTAAAA CTGGTCGGCT ACCGCAGCCG CATCGACAAC TACATCCACA 

401 ACGTTTACGG GAAATGGTGG GATTTGAACG GGGATATTCC GAGCTGGGTC
451 AGCAGCACCG GGCTTGCCTA CACCATCCAA CATCGCrATT TCAwAGACAA
501 AGTGCATCAA nnnnnnnnnn nnnnnnnnnn nnnnTACGAT TATGGGCGTT 
551 TTTTCACCAA CCTTTCTTAC GCCTATCAAA AAAGCACGCA ACCGACCAAC 
601 TTCAGCGATG CGAGCGAATC GCCCAACAAT GCGTCCAAAG AAGACCAACT 
651 CAAACAAGGT TATGGGTTGA GCAGGGTTTC CGCCCTGCCG CGAGATTACG 
701 GACGTTTGGA AGTCGGTACG CGCTGGTTGG GCAACAAACT GACTTTGGGC 
751 GGCGCGATGC GCTATTTCGG CAAGAGCATC CGCGCGACGG CTGAAGAACG 
801 CTATATCGAC GGCACCAACG GGGGAAATAC CAGCAATTTC CGGCAACTGG 
851 GCAAGCGTTC CATCAAACAA ACCGAAACTC TTGCCCGCCA GCCTTTGATT 
901 TTwGATTTTa ACGCCGCTTA CGAGCCGAAG AAAAACCTTA TTTTCCGCGC
951 CGAAGTCAAA AATCTGTTCG ACAGGCGTTA TATCGATCCG CTCGATGCGG 

1001 GCAATGATGC GGCAAC.GAG CGTTATTACA GCTCGTTCGA CCCGAAAGAC

1051 AAGGACrrAG ACGTAACGTG TAATGCTGAT AAAACGTTGT GCaACGGCAA 

1101 ATACGGCGGC ACAAGCAAAA GCGTATTGAC CAATTTTGCA CGCGGACGCA 

1151 CCTTTTTgAT GACGATGAGC TACAAGTTTT AA 

This corresponds to the amino acid sequence [<SEQ ID 876; ORF133>] (SEQ ID NO: 876;
ORF133):

1 . . PGYYGSDDEF KRAFGENSPT XKKHCNRSCG IYEPVLKKYG KKRANNHSVS 

51 ISADFGDYFM PFASYSRTHR MPNIQEMYFS QIGDSGVHTA LKPERANTWQ 

101 FGFXTYKKGL LKQDDTLGLK LVGYRSRIDN YIHNVYGKWW DLNGDIPSWV 

151 SSTGLAYTIQ HRXFXDKVHQ XXXXXXXXYD YGRFFTNLSY AYQKSTQPTN 

201 FSDASESPNN ASKEDQLKQG YGLSRVSALP RDYGRLEVGT RWLGNKLTLG 

251 GAMRYFGKSI RATAEERYID GTNGGNTSNF RQLGKRSIKQ TETLARQPLI 

301 XDFNAAYEPK KNLIFRAEVK NLFDRRYIDP LDAGNDAAXE RYYSSFDPKD 

351 KDXDVTCNAD KTLCNGKYGG TSKSVLTNFA RGRTFLMTMS YKF*

Further work revealed the further partial DNA sequence [<SEQ ID 877>] (SEQ ID NO: 877):



1 GAGGCGCAGA TACAGGTTTT GGAAGATGTG CACGTCAAGG CGAAGCGCGT 

51 ACCGAAAGAC AAAAAAGTGT TTACCGATGC GCGTGCCGTA TCGACCCGTC 

101 AGGATATATT CAAATCCAGC GAAAACCTCG ACAACATCGT ACGCAGCATC 

151 CCCGGTGCGT TTACACAGCA AGATAAAAGC TCGGGCATTG TGTCTTTGAA 

201 TATTCGCGGC GACAGCGGGT TCGGGCGGGT CAATACGATG GTGGACGGCA 

251 TCACGCAGAC CTTTTATTCG ACTTCTACCG ATGCGGGCAG GGCAGGCGGT 

301 TCATCTCAAT TCGGTGCATC TGTCGACAGC AATTTTATTG CCGGACTGGA 

351 TGTCGTCAAA GGCAGCTTCA GCGGCTCGGC AGGCATCAAC AGCCTTGCCG 

401 GTTCGGCGAA TCTGCGGACT TTAGGCGTGG ATGACGTCGT TCAGGGCAAT

451 AATACCTACG GCCTGCTGCT AAAAGGTCTG ACCGGCACCA ATTCAACCAA

501 AGGTAATGCG ATGGCGGCGA TAGGTGCGCG CAAATGGCTG GAAAGCGGAG 

551 CATCTGTCGG TGTGCTTTAC GGGCACAGCA GGCGCAGCGT GGCGCAAAAT 

601 TACCGCGTGG GCGGCGGCGG GCAGCACATC GGAAATTTTG GCGCGGAATA

651 TTTGGAACGG CGCAAGCAGC GATATTTTGT ACAAGAGGGT GCTTTGAAAT 

701 TCAATTCCGA CAGCGGAAAA TGGGAGCGGG ATTTACAAAG GCAACAGTGG 

751 AAATACAAGC CGTATAAAAA TTACAACAAC CAAGAACTAC AaAAATACAT 

801 CGAAGAGCAT GACAAAAGCT GGCGGGAAAA CCTg.CaCCG CAATACGACA

851 TTACCCCCAT CGATCCGTCC AGCCTGAAGC AGCAGTCGGC AGGCAATCTG 

901 TTTAAATTGG AATACGACGG CGTATTCAAT AAATACACGG CGCAATTTCG 

951 CGATTTAAAC ACCAAAATCG GCAGCCGCAA AATCATCAAC CGCAATTATC 

1001 AGTTCAATTA CGGTTTGTCT TTGAACCCGT ATACCAACCT CAATCTGACC 

1051 GCAGCCTACA ATTCGGGCAG GCAGAAATAT CCGAAAGGGT CGAAGTTTAC 

1101 AGGCTGGGGG CTTTTAAAGG ATTTTGAAAC CTACAACAAC GCGAAAATCC 

1151 TCGACCTCAA CAACACCGCC ACCTTCCGGC TGCCCCGCGA AACCGAGTTG 

1201 CAAACCACTT TGGGCTTCAA TTATTTCCAC AACGAATACG GCAAAAACCG 

1251 CTTTCCTGAA GAATTGGGGC TGTTTTTCGA CGGTCCTGAT CAGGACAACG 

1301 GGCTTTATTC CTATTTGGGG CGGTTTAAGG GCGATAAAGG GCTGCTGCCC 

1351 CAAAAATCAA CCATTGTCCA ACCGGCCGGC AGCCAATATT TCAACACGTT 

1401 CTACTTCGAT GCCGCGCTCA AAAAAGACAT TTACCGCTTA AACTACAGCA

1451 CCAATACCGT CGGCTACCGT TTCGGCGGCG AATATACGGG CTATTACGGC 

1501 TCGGATGACG AATTTAAGCG GGCATTCGGA GAAAACTCGC CGACATACAA 

1551 GAAACATTGC AACCGGAGCT GCGGGATTTA TGAACCCGTA TTGAAAAAAT 

1601 ACGGCAAAAA GCGCGCCAAC AACCATTCGG TCAGCATTAG TGCGGACTTC 

1651 GGCGATTATT TCATGCCGTT CGCCAGCTAT TCGCGCACAC ACCGTATGCC 

1701 CAACATCCAA GAAATGTATT TTTCCCAAAT CGGCGACTCC GGCGTTCACA 

1751 CCGCCTTAAA ACCAGAGCGC GCAAACACTT GGCAATTTGG CTTCAATACC 

1801 TATAAAAAAG GATTGTTAAA ACAAGATGAT ACATTAGGAT TAAAACTGGT 

1851 CGGCTACCGC AGCCGCATCG ACAACTACAT CCACAACGTT TACGGGAAAT 

1901 GGTGGGATTT GAACGGGGAT ATTCCGAGCT GGGTCAGCAG CACCGGGCTT 

1951 GCCTACACCA TCCAACATCG CAATTTCAAA GACAAAGTGC ACAAACACGG 

2001 TTTTGAGTTG GAGCTGAATT ACGATTATGG GCGTTTTTTC ACCAACCTTT 

2051 CTTACGCCTA TCAAAAAAGC ACGCAACCGA CCAACTTCAG CGATGCGAGC 

2101 GAATCGCCCA ACAATGCGTC CAAAGAAGAC CAACTCAAAC AAGGTTATGG 

2151 GTTGAGCAGG GTTTCCGCCC TGCCGCGAGA TTACGGACGT TTGGAAGTCG 

2201 GTACGCGCTG GTTGGGCAAC AAACTGACTT TGGGCGGCGC GATGCGCTAT

2251 TTCGGCAAGA GCATCCGCGC GACGGCTGAA GAACGCTATA TCGACGGCAC 

2301 CAACGGGGGA AATACCAGCA ATTTCCGGCA ACTGGGCAAG CGTTCCATCA 

2351 AACAAACCGA AACTCTTGCC CGCCAGCCTT TGATTTTTGA TTTTTACGCC 

2401 GCTTACGAGC CGAAGAAAAA CCTTATTTTC CGCGCCGAAG TCAAAAATCT

2451 GTTCGACAGG CGTTATATCG ATCCGCTCGA TGCGGGCAAT GATGCGGCAA

2501 CGCAGCGTTA TTACAGCTCG TTCGACCCGA AAGACAAGGA CGAAGACGTA 

2551 ACGTGTAATG CTGATAAAAC GTTGTGCAAC GGCAAATACG GCGGCACAAG 

2601 CAAAAGCGTA TTGACCAATT TTGCACGCGG ACGCACCTTT TTGATGACGA 

2651 TGAGCTACAA GTTTTAA 

This corresponds to the amino acid sequence [<SEQ ID 878; ORF133-1>] (SEQ ID NO: 878;
ORF133-1):

1 EAQIQVLEDV HVKAKRVPKD KKVFTDARAV STRQDIFKSS ENLDNIVRSI 

51 PGAFTQQDKS SGIVSLNIRG DSGFGRVNTM VDGITQTFYS TSTDAGRAGG 

101 SSQFGASVDS NFIAGLDVVK GSFSGSAGIN SLAGSANLRT LGVDDVVQGN

151 NTYGLLLKGL TGTNSTKGNA MAAIGARKWL ESGASVGVLY GHSRRSVAQN 

201 YRVGGGGQHI GNFGAEYLER RKQRYFVQEG ALKFNSDSGK WERDLQRQQW 

251 KYKPYKNYNN QELQKYIEEH DKSWRENLXP QYDITPIDPS SLKQQSAGNL 

301 FKLEYDGVFN KYTAQFRDLN TKIGSRKIIN RNYQFNYGLS LNPYTNLNLT 

351 AAYNSGRQKY PKGSKFTGWG LLKDFETYNN AKILDLNNTA TFRLPRETEL

401 QTTLGFNYFH NEYGKNRFPE ELGLFFDGPD QDNGLYSYLG RFKGDKGLLP

451 QKSTIVQPAG SQYFNTFYFD AALKKDIYRL NYSTNTVGYR FGGEYTGYYG

501 SDDEFKRAFG ENSPTYKKHC NRSCGIYEPV LKKYGKKRAN NHSVSISADF 

551 GDYFMPFASY SRTHRMPNIQ EMYFSQIGDS GVHTALKPER ANTWQFGFNT 

601 YKKGLLKQDD TLGLKLVGYR SRIDNYIHNV YGKWWDLNGD IPSWVSSTGL

651 AYTIQHRNFK DKVHKHGFEL ELNYDYGRFF TNLSYAYQKS TQPTNFSDAS 

701 ESPNNASKED QLKQGYGLSR VSALPRDYGR LEVGTRWLGN KLTLGGAMRY 

751 FGKSIRATAE ERYIDGTNGG NTSNFRQLGK RSIKQTETLA RQPLIFDFYA 

801 AYEPKKNLIF RAEVKNLFDR RYIDPLDAGN DAATQRYYSS FDPKDKDEDV 

851 TCNADKTLCN GKYGGTSKSV LTNFARGRTF LMTMSYKF*

Computer analysis of this amino acid sequence gave the following results: 

Homology with the probable TonB-dependent receptor HI121 of H. influenzae (accession
number U32801) (SEQ ID NO: 1167)

ORF133 (SEQ ID NO: 876) and HI121 (SEQ ID NO: 1167) show 57% aa identity in 363aa
overlap:

Orf133: 31   IYEPVLKKYGKKRANNHSVSISADFGDYFMPFASYSRTHRMPNIQEMYFSQIGDSGVHTA 90
             I EP+L K G K+A NHS ++SA+ DYFMPF +YSRTHRMPNIQEM+FSQ+ ++GV+TA
HI121: 563   INEPILHKSGHKKAFNHSATLSAELSDYFMPFFTYSRTHRMPNIQEMFFSQVSNAGVNTA 622

Orf133: 91   LKPERANTWQFGFXTYKKGLLKQDDTLGLKLVGYRSRIDNYIHNVYGKWWDLNGDIPSWV 150
             LKPE+++T+Q GF TYKKGL QDD LG+KLVGYRS I NYIHNVYG WW +P+W
HI121: 623   LKPEQSDTYQLGFNTYKKGLFTQDDVLGVKLVGYRSFIKNYIHNVYGVWW--RDGMPTWA 680

Orf133: 151  SSTGLAYTIQHRXFXDKVHXXXXXXXXXYDYGRFFTNLSYAYQKSTQPTNFSDASESPNN 210
             S G YTI H+ + V YD GRFF N+SYAYQ++ QPTN++DAS PNN
HI121: 681   ESNGFKYTIAHQNYKPIVKKSGVELEINYDMGRFFANVSYAYQRTNQPTNYADASPRPNN 740

Orf133: 211  ASKEDQLKQGYGLSRVSALPRDYGRLEVGTRWLGNKLTLGGAMRYFGKSIRATAEERYID 270
             AS+ED LKQGYGLSRVS LP+DYGRLE+GTRW KLTLG A RY+GKS RAT EE YI +
HI121: 741   ASQEDILKQGYGLSRVSMLPKDYGRLELGTRWFDQKLTLGLAARYYGKSKRATIEEEYIN 800

Orf133: 271  GTNGGNTSNFRQLGKRSIKQTETLARQPLIXDFNAAYEPKKNLIFRAEVKNLFDRRYIDP 330
             G+ + R+ ++K+TE + +QP+I D + +YEP K+LI +AEV+NL D+RY+DP
HI121: 801   GSR-FKKNTLRRENYYAVKKTEDIKKQPIILDLHVSYEPIKDLIIKAEVQNLLDKRYVDP 859

Orf133: 331  LDAGNDAAXERYYSSFDPKDKDXDVTCNADKTLCNGKYGGTSKSVLTNFARGRTFLMTMS 390
             LDAGNDAA +RYYSS + + C D + .C GG+ K+VL NF ARGRT + + + + + +
HI121: 860   LDAGNDAASQRYYSSL NNS IECAQDSSAC GGSDKTVL YNF ARGRT Y I LS LN 910

Orf133: 391  YKF 393
             YKF
HI121: 911   YKF 913

Homology with a predicted ORF from N. meningitidis (strain A) 

ORF133 (SEQ ID NO: 876) shows 90.8% identity over a 392 aa overlap with an ORF (ORF133a) 
(SEQ ID NO: 880) from strain A of N. meningitidis: 

10 10 20 30 

orf 133 .pep PGYYGSDDE FKRAFGENS PTXKKHCNRS CG I 

III 1 1 1 1 1 1 1 M 1 1 1 1 1 lllhllll 

or f 13 3a F Y FDAAL KKD I YRLNYSTNTVGYRFGGXYTGYYXSDDE FKRAFGENS PTYXKHCNQS CGI . 

450 460 470 480 490 500 

15 40 50 60 70 80 90 

orf 133 . pep YEPVLKKYGKKRANNHSVSISADFGDYFMPFASYSRTHRMPNIQEMYFSQIGDSGVHTAL 

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I II I I I I I I I I I I I M I I I I II I I I I I I I I I 
orf 133a YEPVLKKYGKKRANNHSVSISADFGDYFMPFASYSRTHRMPNIQEMYFSQIGDSGVHTAL 
510 520 530 540 550 560 

20 100 110 120 130 140 150 

orf 133 .pep KPERANTWQFGFXTYKKGLLKQDDTLGLKLVGYRSRIDNYIHNVYGKWWDLNGDIPSWVS 

llllllllllll II llllll MM lllillll Ml 1 1 II IM IMMIII II 

orf 133a KPERANTWQFGFNTYKKGLLKQDD I LGLKLVGYRSRI DXY I HNVYGKWWDLNGNI PS WVS 

570 580 590 600 610 620 

25 160 170 180 190 200 210 

orf 133 . pep STGLAYTIQHRXFXDKVHQXXXXXXXXYDYGRFFTNLSYAYQKSTQPTNFSDASESPNNA 

I llllll I I I I | ||||: Ml I I I I I I I I I I I I I II I I II I Ml I I I I 

orf 133a STGLAYTIQHRNFKDKVHKHGFELELNYDYXRFFTNLSYAYQKSTQPTNFSDASESPNNA 
630 640 650 660 670 680 

30 220 230 240 250 260 270 

orf 133 .pep S KEDQLKQGYGLSRVS ALPRDYGRLE VGTRWLGNKLTLGGAMRYFGKS I RATAEERY I DG 

I I I I I I I I I I I I I I I I II I I I I I I I I I I I I I I I I I I I I I I I I I I I II I I I I I I I I I I M 
or f 1 3 3 a S KEDQLKQGYGLSRVS ALPRDYGRLE VGTRWLGNKLTLGGAMRYFGKS I RATAEERY I DX 

690 700 710 720 730 740 

35 280 290 300 310 320 330 

orf 133 . pep TNGGNTSNFRQLGKRSIKQTETLARQPLIXDFNAAYEPKKNLIFRAEVKNLFDRRYIDPL 

III llllllllllll 1 1 1 1 1 1 1 1 1 . I Mill II I II Mill II I llllll 

orf 133a TNGXXTSNFRQLGKRS IXQTETLARQPLI FDXYAAYEPKKXLI FRAEVKNLFDRRYIDPL 

750 760 770 780 790 800 

40 340 350 360 370 380 390 

or f 1 3 3 . pep DAGNDAAXERYYSSFDPKDKDXDVTCNADKTLCNGKYGGTSKSVLTNFARGRTFLMTMS Y 

I M II I MM I III M M 1 1 1 Mill M I M I II I M I M II M 1 1 1 II IMMIII 

or f 1 3 3 a DAGNDAATQRYYSSFDPKDKDEEVTCNDDNTLCNGKYGGTSKSVLTNFARGXTFLITMSY 
810 820 830 840 850 860 
orf 133 .pep KFX 
III 

orfl33a KFX 
870 

A partial ORF133a nucleotide sequence [<SEQ ID 879>] (SEQ ID NO: 879) is: 



1 AAAGACAAAA AAGTGTTTAC CGATGCGCGT GCCGTATCGA CCCGTCAGGA 

51 TATATTCAAA TCCANCGAAA ACCTCGACAA CATCGTACGC ANCATCCCCG 

101 GTGCGTTTAC ACANCAANAT AAAAGCTCGG GCNTTGTGTC TTTGAATATT 

151 CGCNGCGACA GCGGGTTCGG GCGGGTCAAT ACNATGGTNG ACGGCATCAC 

201 NCANACCTTT TATTCGACTT CTACCGATGC GGGCAGGGCA GGCGGTTCAT 

251 CTCAATTCGG TGCATCTGTC GACAGCAATT TTATNGCCGG ACTGGATGTC 

301 GTCAAAGGCA GCTTCAGCGG CTCGGCAGGC ATCAACAGCC TTGCCGGTTC 

351 GGCGAATCTG CGGACTTTAN GCGTGGATGA TGTCGTTCAG GGCAATANTA 

401 CNTACGGCCT GCTGCTAAAA GGTCTGACCG GCACCAATTC AACCAAAGGT 
451 AATGCGATGG CGGCGATAGG TGCGCGCAAA TGGCTGGAAA GCGGAGCATC 
501 TGTCGGTGTG CTTTACGGGC ACAGCAGGCG CAGCGTGGCG CAAAATTACC 
551 GCGTGGGCGG CGGCGGGCAG CACATCGGAA ATTTTGGCGC GGAATATCTG 
601 GAACGACGCA AGCAACGATA TTTTGAGCAA GAAGGCGGGT TGAAATTCAA 
651 TTCCAACAGC GGAAAATGGG AGCGGGATTT CCAAAAGTCG TACTGGAAAA 
701 CCAAGTGGTA TCAAAAATAC GATGCCCCCC AAGAACTGCA AAAATACATC 
751 GAAGGTCATG ATAAAAGCTG GCGGGAAAAC CTGGCGCCGC AATACGACAT 

801 CACCCCCATC GATCCGTCCA GCCTGAAGCN GGAGTCGGCA GGCAACCTGT 

851 TTAAATTGGA ATACGACGGC GTATTCAATA AATACACGGC GCAATTTCGC 

901 GATTTAAACA CCAAAATCGG CAGCCGCAAA ATCATCAACC GCAATTATCA 

951 ATTCAATTAC GGTTTGTCTT TGAACCCGTA TACCAACCTC AATCTGACCG 

1001 CAGCCTACAA TTCGGGCAGG CAGAAATATC CGAAAGGGTC GAAGTTTACA 

1051 GGCTGGGGGC TTTTNAAAGA TTTTGAAACC TACAACAACG CAAAAATCCT 

1101 CGACCTCANC AACACCTCCA CCTTCCGGCT GCCCCGTGAA ACCGAGTTGC 

1151 AAACCACTTT GGGCTTCAAT TATTTCCACA ACGAATACGG CAAAAACCGC 

1201 TTTCCTGAAG AATTGGGGCT GTTTTTCGAC GGTCCGGATC ANGACAACGG 

1251 GCTTTATTCC TATTTGGGGC GGTTTAAGGG CGATAAAGGG CTGCTGCCCC 

1301 AAAAATCAAC CATTGTCCAA CCGGCCGGCA GCCAATATTT CAACACGTTC 
1351 TACTTCGATG CCGCGCTCAA AAAAGACATT TACCGCTTAA ACTACAGCAC 

1401 CAATACCGTC GGCTACCGTT TCGGCGGCNA ATATACGGGC TATTACNGCT 
1451 CGGATGACGA ATTTAAGCGG GCATTCGGAG AAAACTCGCC GACATACANG 
1501 AAACATTGCA ACCAGAGCTG CGGAATTTAT GAACCCGTAT TGAAAAAATA 
1551 CGGCAAAAAG CGCGCCAACA ACCATTCGGT CAGCATTAGT GCGGACTTCG 
1601 GCGATTATTT CATGCCGTTC GCCAGCTATT CGCGCACACA CCGTATGCCC 
1651 AACATCCAAG AAATGTATTT TTCCCAAATC GGCGACTCCG GCGTTCACAC 
1701 CGCCTTAAAA CCAGAGCGCG CAAACACTTG GCAATTTGGC TTCAATACCT 
1751 ATAAAAAAGG ATTGTTAAAA CAAGATGATA TATTAGGATT AAAACTGGTC 
1801 GGCTACCGCA GCCGCATCGA CNACTACATC CACAACGTTT ACGGGAAATG 
1851 GTGGGATTTG AACGGGAATA TTCCGAGCTG GGTCAGCAGC ACCGGGCTTG 
1901 CCTACACCAT CCAACACCGC AATTTCAAAG ACAAAGTGCA CAAACACGGT 
1951 TTTGAGTTGG AGCTGAATTA CGATTATNGG CGTTTTTTCA CCAACCTTTC 
2001 TTACGCCTAT CAAAAAAGCA CGCAACCGAC CAACTTCAGC GATGCGAGCG 
2051 AATCGCCCAA CAATGCGTCC AAAGAAGACC AACTCAAACA AGGTTATGGG 
2101 TTGAGCAGGG TTTCCGCCCT GCCGCGAGAT TACGGACGTT TGGAAGTCGG 
2151 TACGCGCTGG TTGGGCAACA AACTGACTTT GGGCGGCGCG ATGCGCTATT 

2201 TCGGCAAGAG CATCCGCGCG ACGGCTGAAG AACGCTATAT CGACGNCACC 
2251 AATGGGGNAN NTACCAGCAA TTTCCGGCAA CTGGGCAAGC GTTCCATCAN 
2301 ACAAACCGAA ACCCTTGCCC GCCAGCCTTT GATTTTTGAT TTNTACGCCG 

2351 CTTACGAGCC GAAGAAAAAN CTTATTTTCC GCGCCGAAGT CAAAAATCTG 
2401 TTCGACAGGC GTTATATCGA TCCGCTCGAT GCGGGCAATG ATGCGGCAAC 
2451 GCAGCGTTAT TACAGTTCGT TCGACCCGAA AGACAAGGAC GAAGAAGTAA 
2501 CGTGTAATGA TGATAACACG TTATGCAACG GCAAATACGG CGGCACAAGC 
2551 AAAAGCGTAT TGACCAATTT TGCACGCGGA CNCACCTTTT TGATAACGAT 
2601 GAGCTACAAG TTTTAA 
This encodes a protein having (partial) amino acid sequence [<SEQ ID 880>] (SEQ ID NO: 880): 

1 KDKKVFTDAR AVSTRQDIFK SXENLDNIVR XIPGAFTXQX KSSGXVSLNI 

51 RXDSGFGRVN TMVDGITXTF YSTSTDAGRA GGSSQFGASV DSNFXAGLDV 

101 VKGSFSGSAG INSLAGSANL RTLXVDDVVQ GNXTYGLLLK GLTGTNSTKG 

151 NAMAAIGARK WLESGASVGV LYGHSRRSVA QNYRVGGGGQ HIGNFGAEYL 

201 ERRKQRYFEQ EGGLKFNSNS GKWERDFQKS YWKTKWYQKY DAPQELQKYI 

251 EGHDKSWREN LAPQYDITPI DPSSLKXQSA GNLFKLEYDG VFNKYTAQFR 

301 DLNTKIGSRK IINRNYQFNY GLSLNPYTNL NLTAAYNSGR QKYPKGSKFT 

351 GWGLXKDFET YNNAKILDLX NTSTFRLPRE TELQTTLGFN YFHNEYGKNR 

401 FPEELGLFFD GPDXDNGLYS YLGRFKGDKG LLPQKSTIVQ PAGSQYFNTF 

451 YFDAALKKDI YRLNYSTNTV GYRFGGXYTG YYXSDDEFKR AFGENSPTYX 

501 KHCNQSCGIY EPVLKKYGKK RANNHSVSIS ADFGDYFMPF ASYSRTHRMP 

551 NIQEMYFSQI GDSGVHTALK PERANTWQFG FNTYKKGLLK QDDILGLKLV 

601 GYRSRIDXYI HNVYGKWWDL NGNIPSWVSS TGLAYTIQHR NFKDKVHKHG 

651 FELELNYDYX RFFTNLSYAY QKSTQPTNFS DASESPNNAS KEDQLKQGYG 

701 LSRVSALPRD YGRLEVGTRW LGNKLTLGGA MRYFGKSIRA TAEERYIDXT 

751 NGXXTSNFRQ LGKRSIXQTE TLARQPLIFD XYAAYEPKKX LIFRAEVKNL 

801 FDRRYIDPLD AGNDAATQRY YSSFDPKDKD EEVTCNDDNT LCNGKYGGTS 

851 KSVLTNFARG XTFLITMSYK F* 

ORF133a (SEQ ID NO: 880) and ORF133-1 (SEQ ID NO: 878) show 94.3% identity in 871 aa 
overlap: 

.10 20 30 40 

25 orf 133a. pep KDKKVFTDARAVSTRQD I FKSXENLDN I VRX I PGAFTXQXKS 

IIIIIIIIIIIMIIIIIIII Mill I Mill I M 

orf 133 - 1 EAQ I QVLEDVHVKAKRVP KDKKVFTDARAVSTRQD I FKSSENLDN I VRS I PGAFTQQDKS 

10 20 30 40 50 60 

50 60 70 80 90 100 

30 orf 133a . pep SGXVSLNIRXDSGFGRVNTMVDGITXTFYSTSTDAGRAGGSSQFGASVDSNFXAGLDWK 

II llllll MINIM INI 1 1 1 II II 1 1 1 1 II 1 1 1 1 1 1 1 1 M I lllllll 

orf 133 - 1 SGIVSLNIRGDSGFGRVNTMVDGITQTFYSTSTDAGRAGGSSQFGASVDSNFIAGLDWK 

70 80 90 100 110 120 

110 120 130 140 150 160 

35 orf 133a . pep GSFSGSAGINSLAGSANLRTLXVDDWQGNXTYGLLLKGLTGTNSTKGNAMAAIGARKWL 

1 1 1 1 1 1 1 II 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 II 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 

orf 133 - 1 GSFSGSAGINSLAGSANLRTLGVDDWQGNNTYGLLLKGLTGTNSTKGNAMAAIGARKWL 

130 140 150 160 170 180 

170 180 190 200 210 220 

40 orf 133a . pep ESGASVGVLYGHSRRSVAQNYRVGGGGQHIGNFGAEYLERRKQRYFEQEGGLKFNSNSGK 

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I II I II I I I I • I I I I I : I I I 
orf 133 - 1 ESGASVGVLYGHSRRSVAQNYRVGGGGQHIGNFGAEYLERRKQRYFVQEGALKFNSDSGK 

190 200 210 220 230 240 

230 240 250 260 270 280 

45 orf 133a . pep WERDFQKSYWKTKWYQKYDAPQELQKYIEGHDKSWRENLAPQYDITPIDPSSLKXQSAGN 

IIIMM: II I l-M MIMI MINIM 1 1 1 1 1 1 II 1 1 1 1 1 1 Mill 

orf 133-1 WERDLQRQQWKYKPYKNYNN-QELQKYIEEHDKSWRENLXPQYDITPIDPSSLKQQSAGN 

250 260 270 280 290 



290 300 310 320 330 340 

orf 133a.pep LFKLEYDGVFNKYTAQFRDLNTKIGSRKIINRNYQFNYGLSLNPYTNLNLTAAYNSGRQK 

|||||||||||||||||||||||||||||||||||||||||||||||||||||||||||| 

orf 133-1 LFKLEYDGVFNKYTAQFRDLNTKIGSRKIINRNYQFNYGLSLNPYTNLNLTAAYNSGRQK 

300 310 320 330 340 350 

350 360 370 380 390 400 

orf 133a . pep YPKGSKFTGWGLXKDFETYNNAKILDLXNTSTFRLPRETELQTTLGFNYFHNEYGKNRFP 

Mllllllllll 1 1 1 1 1 Ml IMMI IMIMI IIIIIIMII MM MM 

orf 133 - 1 YPKGSKFTGWGLLKDFETYNNAKILDLNNTATFRLPRETELQTTLGFNYFHNEYGKNRFP 
360 370 380 390 400 410 






410 420 430 440 450 460 

orf 133a . pep EELGLFFDGPDXDNGLYSYLGRFKGDKGLLPQKSTIVQPAGSQYFNTFYFDAALKKDIYR 

lllllllllll 1 1 ! 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 : 1 1 ! 1 1 i I I I ! I i I I I I I I ^ I [ I I 1 ! I I 

orf 133-1 EELGLFFDGPDQDNGLYSYLGRFKGDKGLLPQKSTIVQPAGSQYFNTFYFDAALKKDIYR 
420 430 440 450 460 470 






470 480 490 500 510 520 

orf 133a . pep LNYSTNTVGYRFGGXYTGYYXSDDEFKRAFGENSPTYXKHCNQSCGIYEPVLKKYGKKRA 

Mllllllllll Mill MINIMUM IMM MMIMIMM 

orf 133-1 LNYSTNTVGYRFGGEYTGYYGSDDEFKRAFGENSPTYKKHCNRSCGIYEPVLKKYGKKRA 
480 490 500 510 520 530 






530 540 550 560 570 580 

orf 133a . pep NNHSVSISADFGDYFMPFASYSRTHRMPNIQEMYFSQIGDSGVHTALKPERANTWQFGFN 

1 1 1 II II 1 1 M 1 1 II Ml 1 1 1 ! 1 1 1 1 M M I II I M 1 1 1 1 M 1 1 1 1 1 1 1 1 1 1 III 1 1 

orf 133 - 1 NNHSVSISADFGDYFMPFASYSRTHRMPNIQEMYFSQIGDSGVHTALKPERANTWQFGFN 
540 550 560 570 580 590 






590 600 610 620 630 640 

orf 133a. pep T YKKGLLKQDD I LGLKLVG YRSR I DX Y IHNVYGKWWDLNGN I PS WVSSTGLAYT I QHRNF 

1 1 1 F 1 1 1 1 1 1 1 MIMMIII 1 1 1 1 1 Ml II M I M 1 1 1 1 1 1 Ml 1 1 1 1 1 1 1 1 

orf 133-1 TYKKGLLKQDDTLGLKLVGYRSRIDNYIHNVYGKWWDLNGDIPSWVSSTGLAYTIQHRNF 
600 610 620 630 640 650 






650 660 670 680 690 700 

orf 133a. pep KDKVHKHGFELELNYDYXRFFTNLSYAYQKSTQPTNFSDASESPNNASKEDQLKQGYGLS 

I I I I I I I II I I II I I I I I I I I I I II I I I I I I I I I I I I I I I I I I I I I I I M I I I I I I I I I 
orf 133 - 1 KDKVHKHGFELELNYDYGRFFTNLSYAYQKSTQPTNFSDASESPNNASKEDQLKQGYGLS 

660 , 670 680 690 ' 700 710 






710 720 730 . 740 750 760 

orf 133a . pep RVSALPRDYGRLEVGTRWLGNKLTLGGAMRYFGKSIRATAEERYIDXTNGXXTSNFRQLG 

I Ml 1 1 1 II 1 1 1 1 II M 1 1 1 1 1 M 1 1 1 1 1 1 1 M I M 1 1 1 II III III III III 

orf 133-1 RVSALPRDYGRLEVGTRWLGNKLTLGGAMRYFGKS IRATAEERYIDGTNGGNTSNFRQLG 

720 730 740 750 760 770 






770 780 790 800 810 820 

or f 1 3 3 a . pep KRS IXQTETLARQPLI FDXYAAYEPKKXLI FRAEVKNLFDRRY IDPLDAGNDAATQRYYS 

1 1 1 1 Ml IMM IMM II 1 1 1 1 II 1 1 1 II 1 1 1 II M I M II 1 1 1 I II 

orf 133 - 1 KRS I KQTETLARQPL I FDFYAAYEPKKNL I FRAEVKNLFDRRY IDPLDAGNDAATQRYYS 

780 790 800 810 820 830 






830 840 850 860 870 

orf 133a. pep S FDPKDKDEE VTCNDDNTLCNGKYGGTS KS VLTNFARGXTFL I TMS YKFX 

MMMIIMM M 1 1 1 1 1 1 1 1 II 1 1 1 Ml 1 1 1 IMMIIIM 

orf 133-1 SFDPKDKDEDVTCNADKTLCNGKYGGTSKSVLTNFARGRTFLMTMSYKFX 
840 850 860 870 880 



Homology with a predicted ORF from N. gonorrhoeae 

ORF133 (SEQ ID NO: 876) shows 92.3% identity over a 392 aa overlap with a predicted ORF 
(ORF133ng) (SEQ ID NO: 882) from N. gonorrhoeae: 

orf 133 .pep PGYYGSDDE FKRAFGENS PTXKKHCNRSCG I 31 

I I I hh I I I I I h I I h hlh |||: 
5 orf 133ng FYFDAALKKDIYRLNYSTNAINYRFGGEYTGYYGSENEFKRAFGENSPAYKEHCDPSCGL 560 

orf 133 .pep YEPVLKKYGKKRANNHSVSISADFGDYFMPFASYSRTHRMPNIQEMYFSQIGDSGVHTAL 91 

I I I I I I I I I II I I I I I I h 1 i I I I hi I I hi I I I I I I I h I I I I I I I I I I I I I I I I 
orf 133ng YEPVLKKYGKKRANNHSVSISADFGDYFMPFAGYSRTHRMPNIQEMYFSQIGDSGVHTAL 62 0 

orf 133 .pep KPERANTWQFGFXTYKKGLLKQDDTLGLKLVGYRSRIDNYIHNVYGKWWDLNGDIPSWVS 151 

10 Ml II I Mill I MM I III I II MM INN I MM IMIIIII M MM I llllh 

orf 133ng KPERANTWQFGFNT YKKGLLKQDD I LGLKLVGYRSR I DNY I HNVYGKWWDLNGD I PS WVG 680 

orf 133 .pep STGLAYTIQHRXFXDKVHQXXXXXXXXYDYGRFFTNLSYAYQKSTQPTNFSDASESPNNA 211 

Illllllhll I lllh II I I hhh h II I III hlh I llllh 

orf 13 3ng STGLAYTIRHRNFKDKVHKHGFELELNYDYGRFFTNLSYAYQKSTQPTNFSDASESPNNA 74 0 

orf 133 .pep SKEDQLKQGYGLSRVSALPRDYGRLEVGTRWLGNKLTLGGAMRYFGKSIRATAEERYIDG 271 

II I II I I I I I M I I II II I II I I I II I I I I I I I I I I I I I M I I I I II I M I I I I I I II II 
orf 1 3 3 ng SKEDQLKQGYGLSRVSALPRDYGRLEVGTRWLGNKLTLGGAMRYFGKS IRATAEERYIDG 800 

orf 133. pep TNGGNTSNFRQLGKRS I KQTETLARQPLIXDFNAAYEPKKNLI FRAEVKNLFDRRYIDPL 331 

Ml Ml II llllh 1 1 II III hi II I M II II II 1 1 1 1 1 M II 1 1 II II 1 1 II 1 1 

20 or f 1 3 3 ng TNGGNTSNVRQLGKRS I KQTETLARQPL I FDFYAAYEPKKNLI FRAEVKNLFDRRYIDPL 860 

or f 1 3 3 . pep DAGNDAAXERYYSSFDPKDKDXDVTCNADKTLCNGKYGGTSKSVLTNFARGRTFLMTMSY 3 91 

I I I M I I : : I I I I I I > I I I I I II II I I I II I I I I I II I I I I I M I I M I M I M I M I 
orf 133ng DAGNDAATQRYYSSFDPKDKDEDVTCNADKTLCNGKYGGTSKSVLTNFARGRTFLMTMSY 92 0 

orf 133. pep KF 393 

25 M 

orfl33ng KF 922 

The complete length ORF133ng nucleotide sequence [<SEQ ID 881>] (SEQ ID NO: 881) is 
predicted to encode a protein having amino acid sequence [<SEQ ID 882>] (SEQ ID NO: 882): 

1 MRSSFRLKPI CFYLMGVNLY HHSYAEDAGR AGSEAQIQVL EDVHVKAKRV 

51 PKDKKVFTDA RAVSTRQDVF KSGENLDNIV RSIPGAFTQQ DKSSGIVSLN 

101 IRGDSGFGRV NTMVDGITQT FYSTSTDAGR AGGSSQFGAS VDSNFIAGLD 

151 VVKGSFSGSA GINSLAGSAN LRTLGVDDVV QGNNTYGLLL KGLTGTNSTK 

201 GNAMAAIGAR KWLESGASVG VLYGHSRRGV AQNYRVGGGG QHIGNFGEEY 

251 LERRKQQYFV QEGGLKFNAG SGKWERDLQR QYWKTKWYKK YEDPQELQKY 

301 IEEHDKSWRE NLAPQYDITP IDPSGLKQQS AGNLLNLEYD GVFNKYTAQF 

351 RDLNTRIGSR KIINRNYQFN YGLSLNPYTN LNLTAAYNSG RQKYPKGAKF 

401 TGWGLLKDFE TYNNAKILDL NNTATFRLPR ETELQTTLGF NYFHNEYGKN 

451 RFPEELGLFF DGPDQDNGLY SYLGRFKGDK GLLPQKSTIV QPAGSQYFNT 

501 FYFDAALKKD IYRLNYSTNA INYRFGGEYT GYYGSENEFK RAFGENSPAY 

551 KEHCDPSCGL YEPVLKKYGK KRANNHSVSI SADFGDYFMP FAGYSRTHRM 

601 PNIQEMYFSQ IGDSGVHTAL KPERANTWQF GFNTYKKGLL KQDDILGLKL 

651 VGYRSRIDNY IHNVYGKWWD LNGDIPSWVG STGLAYTIRH RNFKDKVHKH 

701 GFELELNYDY GRFFTNLSYA YQKSTQPTNF SDASESPNNA SKEDQLKQGY 

751 GLSRVSALPR DYGRLEVGTR WLGNKLTLGG AMRYFGKSIR ATAEERYIDG 

801 TNGGNTSNVR QLGKRSIKQT ETLARQPLIF DFYAAYEPKK NLIFRAEVKN 

851 LFDRRYIDPL DAGNDAATQR YYSSFDPKDK DEDVTCNADK TLCNGKYGGT 

901 SKSVLTNFAR GRTFLMTMSY KF* 

A variant was also identified, being encoded by the gonococcal DNA sequence [<SEQ ID 883>] 
(SEQ ID NO: 883): 



1 ATGAGATCTT CTTTCCGGTT GAAGCCGATT TGTTTTTATC TTATGGGTGT 

51 TATGCTATAT CATCATAGTT ATGCCGAAGA TGCAGGGCGC GCGGGCAGCG 

101 AGGCGCAGAT ACAGGTTTTG GAAGATGTGC ACGTCAAGGC GAAGCGCGTA 

151 CCGAAAGACA AAAAAGTGTT TACCGATGCG CGTGCCGTAT CGACCCGTca 

201 gGATGTGTTC AAATCCGGCG AAAACCTCGA CAACATCGTA CGCAGCATAC 

251 CCGGTGCGTT TACACAGCAA GATAAAAGCT CGGGCATTGT GTCTTTGAAT 

301 ATTCGCGGCG ACAGCGGGTT CGGGCGGGTC AATACGATGG TGGACGGCAT 

351 CACGCAGACC TTTTATTCGA CTTCTACCGA TGCGGGCAGG GCAGGCGGTT 

401 CATCTCAATT CGGTGCATCT GTCGACAGCA ATTTTATTGC CGGACTGGAT 

4 51 GTCGTCAAAG GCAGCTTCAG CGGCTCGGCA GGCATCAACA GCCTTGCCGG 

501 TTCGGCGAAT CTGCGGACTT TAGGCGTGGA TGACGTCGTT CAGGGCAATA 

551 ATACCTACGG CCTGCTGCTA AAAGGTCTGA CCGGCACCAA TTCAACCAAA 

601 GGTAATGCGA TGGCGGCGAT AGGTGCGCGC AAATGGCTGG AAAGCGGAGC 

651 GTCTGTCGGT GTGCTTTACG GGCACAGCAG GCGCGGCGTG GCGCAAAATT 

7 01 ACCGCGTGGG CGGCGGCGGG CAGCACATCG GAAATTTTGG TGAAGAATAT 

751 CTGGAACGGC GCAAACAGCA ATATTTTGTA CAAGAGGGTG GTTTGAAATT 

801 CAATGCCGGC AGCGGAAAAT GGGAACGGGA TTTGCAAAGG CAATACTGGA 

851 AAACAAAGTG GTATAAAAAA TACGAAGACC CCCAAGAACT GCAAAAATAC 

901 ATCGAAGAGC ATGATAAAAG CTGGCGGGAA AACCTGGCGC CGCAATACGA 

951 CATCACCCCC ATCGATCCGT CCGGCCTGAA GCAGCAGTCG GCAGGCAATC 

1001 TGTTTAAATT GGAATACGAC GGCGTATTCA ATAAATACAC GGCGCAATTT 

1051 CGCGATTTAA ACACCAGAAT CGGCAGCCGC AAAATCATCA ACCGCAATTA 

1101 TCAATTCAAT TACGGTTTGT CTTTGAACCC GTATACCAAC CTCAATCTGA 

1151 CCGCAGCCTA CAATTCGGGC AGGCAGAAAT ATCCGAAAGG GGCGAAGTTT 

12 01 ACAGGCTGGG GGCTTTTAAA AGATTTTGAA ACCTACAACA ACGCGAAAAT 

1251 CCTCGACCTC AACAACACCG CCACCTTCCG GCTGCCCCGC GAAACCGAGT 

1301 TGCAAACCAC TTTGGGCTTC AATTATTTCC ACAACGAATA CGGCAAAAAC 

1351 CGCTTTCCTG AAGAATTGGG GCTGTTTTTC GACGGTCCTG ATCAGGACAA 

14 01 CGGGCTTTAT TCCTATTTGG GGCGGTTTAA GGGCGATAAA GGGCTGTTGC 

1451 CTCAAAAATC AACCATTGTC CAACCGGCCG GCAGCCAATA TTTCAACACG 

1501 TTCTACTTCG ATGCCGCGCT CAAAAAAGAC ATTTACCGCT TAAACTACAG 

1551 CACCAATGCA ATCAACTACC GTTTCGGCGG CGAATATACG GGCTATTACG 

1601 GCTCGGAAAA CGAATTTAAG CGGGCATTCG GAGAAAACTC GCCGGCATAC 

1651 AAGGAACATT GCGACCCGAG CTGCGGGCTT TATGAACCCG TATTGAAAAA 

1701 ATACGGCAAA AAGCGCGCCA ACAACCATTC GGTCAGCATT AGTGCGGACT 

1751 TCGGCGATTA TTTCATGCCG TTCGCCGGCT ATTCGCGCAC ACACCGTATG 

1801 CCCAACATCC AAGAAATGTA TTTTTCCCAA ATCGGCGACT CCGGCGTTCA 

1851 CACCGCCTTA AAACCAGAGC GCGCAAACAC TTGGCAATTT GGCTTCAATA 

1901 CCTATAAAAA AGGATTGTTA AAACAAGATG ATATATTAGG ATTGAAACTG 

1951 GTCGGCTACC GCAGCCGCAT TGACAACTAC ATCCACAACG TTTACGGGAA 

2001 ATGGTGGGAT TTGAACGGGG ATATTCCGAG CTGGGTCGGC AGCACCGGGC 

2051 TTGCCTACAC CATCCGACAC CGCAATTTCA AAGACAAAGT GCACAAACAC 

2101 GGTTTTGAGC TGGAGCTGAA TTACGATTAT GGGCGTTTTT TCACCAACCT 

2151 TTCTTACGCC TATCAAAAAA GCACGCAACC GACCAATTTC AGCGATGCGA 

2201 GCGAATCGCC CAACAATGCC tccaaAGAAG ACCAACTCAA ACAAGGTTAT 

2251 GGGCTGAGCA GGGTTTCCGC CCTGCCGCGA GATTACGGAC GTTTGGAAGT 

2301 CGGTACGCGC TGGTTGGGCA ACAAACTGAC TTTGGGCGGC GCGAtgcGCT 

2351 ATTTCGGCAA GAGCATCCGC GCGACGGCTG AAGAACGCTA TATCGACGGC 

2401 ACCAACGGGG GAAATACCAG CAATGTCCGG CAACTGGGCA AGCGTTCCAT 
2451 CAAACAAACC GAAACCCTTG CCCGACAGCC TTTGATTTTT GATTTTTACG 
2501 CCGCTTACGA GCCGAAGAAA AACCTTATTT TCCGCGCCGA AGTCAAAAAC 
2551 CTGTTCGACA GGCGTTATAT CGATCCGCTC GATGCGGGCA ATGATGCGGC 
2601 AACGCAGCGT TATTACAGCT CGTTCGACCC GAAAGACAAG GACGAAGACG 
2651 TAACGTGTAA TGCTGATAAA ACGTTGTGCA ACGGCAAATA CGGCGGCACA 

2701 AGCAAAAGCG TATTGACCAA TTTCGCACGC GGACGCACCT TCTTGATGAC 

2751 GATGAGCTAC AAGTTTTAA 

This corresponds to the amino acid sequence [<SEQ ID 884; ORF133ng-1>] (SEQ ID NO: 884; 
ORF133ng-1): 

1 MRSSFRLKPI CFYLMGVMLY HHSYAEDAGR AGSEAQIQVL EDVHVKAKRV 

51 PKDKKVFTDA RAVSTRQDVF KSGENLDNIV RSIPGAFTQQ DKSSGIVSLN 

101 IRGDSGFGRV NTMVDGITQT FYSTSTDAGR AGGSSQFGAS VDSNFIAGLD 

151 VVKGSFSGSA GINSLAGSAN LRTLGVDDVV QGNNTYGLLL KGLTGTNSTK 

201 GNAMAAIGAR KWLESGASVG VLYGHSRRGV AQNYRVGGGG QHIGNFGEEY 

251 LERRKQQYFV QEGGLKFNAG SGKWERDLQR QYWKTKWYKK YEDPQELQKY 

301 IEEHDKSWRE NLAPQYDITP IDPSGLKQQS AGNLFKLEYD GVFNKYTAQF 

351 RDLNTRIGSR KIINRNYQFN YGLSLNPYTN LNLTAAYNSG RQKYPKGAKF 

401 TGWGLLKDFE TYNNAKILDL NNTATFRLPR ETELQTTLGF NYFHNEYGKN 

451 RFPEELGLFF DGPDQDNGLY SYLGRFKGDK GLLPQKSTIV QPAGSQYFNT 

501 FYFDAALKKD IYRLNYSTNA INYRFGGEYT GYYGSENEFK RAFGENSPAY 

551 KEHCDPSCGL YEPVLKKYGK KRANNHSVSI SADFGDYFMP FAGYSRTHRM 

601 PNIQEMYFSQ IGDSGVHTAL KPERANTWQF GFNTYKKGLL KQDDILGLKL 

651 VGYRSRIDNY IHNVYGKWWD LNGDIPSWVG STGLAYTIRH RNFKDKVHKH 

701 GFELELNYDY GRFFTNLSYA YQKSTQPTNF SDASESPNNA SKEDQLKQGY 

751 GLSRVSALPR DYGRLEVGTR WLGNKLTLGG AMRYFGKSIR ATAEERYIDG 

801 TNGGNTSNVR QLGKRSIKQT ETLARQPLIF DFYAAYEPKK NLIFRAEVKN 

851 LFDRRYIDPL DAGNDAATQR YYSSFDPKDK DEDVTCNADK TLCNGKYGGT 

901 SKSVLTNFAR GRTFLMTMSY KF* 

ORF133ng-1 (SEQ ID NO: 884) and ORF133-1 (SEQ ID NO: 878) show 96.2% identity in 889 aa 
overlap: 



10 20 30 40 50 60 

30 orf 133ng-l .pep S FRLKP I CFYLMGVMLYHHS YAEDAGRAGSEAQ I QVLEDVHVKAKRVPKDKKVFTDARAV 

I I II I M I I I I I I I I I I I I I I I I I I I I I 
orfl33-l EAQ I QVLEDVHVKAKRVPKDKKVFTDARAV 

10 20 30 

70 80 90 100 110 120 

35 orf 133ng-l .pep STRQDVFKSGENLDNIVRSIPGAFTQQDKSSGIVSLNIRGDSGFGRVNTMVDGITQTFYS 

I I I I : I M :. I I I I I i I I I I I I ' I I I I I I I I I I I I I I I I I I I I I I I : I I I I I I I I I I I I 
orf 133-1 STRQDIFKSSENLDNIVRSIPGAFTQQDKSSGIVSLNIRGDSGFGRVNTMVDGITQTFYS 

40 50 60 70 80 90 

130 140 150 160 170 180 

40 orf 133ng-l .pep TSTDAGRAGGSSQFGASVDSNFIAGLDWKGSFSGSAGINSLAGSANLRTLGVDDWQGN 

1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 M 1 1 1 1 1 1 U 1 1 1 1 1 1 1 1 1 1 1 

orf 133 - 1 TSTDAGRAGGSSQFGASVDSNFIAGLDWKGSFSGSAGINSLAGSANLRTLGVDDWQGN 

100 110 120 130 140 150 

190 200 210 220 230 240 

45 orf 133ng-l .pep NTYGLLLKGLTGTNSTKGNAMAA I GARKWLESGAS VGVLYGHSRRGVAQNYRVGGGGQH I 

MINIM llllllllllll llllllllllll II II IMMM llllllllllll 
or f 1 3 3 - 1 NTYGLLLKGLTGTNSTKGNAMAAIGARKWLESGAS VGVLYGHSRRS VAQNYRVGGGGQH I 

160 170 180 190 200 210 

250 260 270 280 290 300 

50 orf 133ng-l .pep GNFGEEYLERRKQQYFVQEGGLKFNAGSGKWERDLQRQYWKTKWYKKYEDPQELQKYIEE 
it 1 1 1 1 1 1 1 1 1 i * 1 1 1 1 1 1 * 1 1 ii * 1 1 1 1 1 1 1 i i * i • * ii ii ii 1 1 1 

orf 13 3 - 1 GNFGAEYLERRKQRYFVQEGALKFNSDSGKWERDLQRQQWKYKPYKNYNN- QELQKYIEE 

220 230 240 250 260 

310 320 330 340 350 360 

5 orf 133ng-l .pep HDKSWRENLAPQYDITPIDPSGLKQQSAGNLFKLEYDGVFNKYTAQFRDLNTRIGSRKII 

MINI I I I ! I I I I i I : I I I I I I I I 1 I I M II I I I I I M 1 1 1 1 1 hll I M I I 
orf 133 - 1 HDKSWRENLXPQYDITPIDPSSLKQQSAGNLFKLEYDGVFNKYTAQFRDLNTKIGSRKII 
270 280 290 300 310 320 

370 380 390 400 410 420 

1 0 or f 13 3ng- 1 . pep NRNYQFNYGLSLNPYTNLNLTAAYNSGRQKYPKGAKFTGWGLLKDFETYNNAKILDLNNT 

I II | I II I I I! I I ' I I I I I I I II I ' I I I I M Ihl I I I I I I I I I I I I I I I II I I I I I I 
orf 133-1 NRNYQFNYGLSLNPYTNLNLTAAYNSGRQKYPKGSKFTGWGLLKDFETYNNAKILDLNNT 
330 340 350 360 370 380 

430 440 450 460 470 480 . 

15 orf 133ng-l .pep ATFRLPRETELQTTLGFNYFHNEYGKNRFPEELGLFFDGPDQDNGLYSYLGRFKGDKGLL 

MINI MM II Mill MM MM I II 1 1 1! 1 1 II Mill MINI M III II II II II 

orf 133 - 1 ATFRLPRETELQTTLGFNYFHNEYGKNRFPEELGLFFDGPDQDNGLYSYLGRFKGDKGLL 
390 400 410 420 430 440 

490 500 510 520 530 540 

20 orf 133ng-l .pep PQKSTIVQPAGSQYFNTFYFDAALKKDIYRLNYSTNAINYRFGGEYTGYYGSENEFKRAF 

I I I I I I I I I I I I I I I I I I I I I I I I ■ I I I I I I I I I i I I I I I I I I I I I -I I I I I I 

orf 133-1 PQKSTIVQPAGSQYFNTFYFDAALKKDIYRLNYSTNTVGYRFGGEYTGYYGSDDEFKRAF 
450 460 470 480 490 500 

550 560 570 580 590 600 

25 orf 133ng-l .pep GENSPAYKEHCDPSCGLYEPVLKKYGKKRANNHSVSISADFGDYFMPFAGYSRTHRMPNI 

! I I i : I ! : I I : || h I I II I I I I I M I I I M M II I I II I M I I I I h II I I II M I I 
orf 133-1 GENSPTYKKHCNRSCGIYEPVLKKYGKKRANNHSVSISADFGDYFMPFASYSRTHRMPNI 
510 520 530 540 550 560 

610 620 630 640 650 660 

30 orf 133ng-l .pep QEMYFSQIGDSGVHTALKPERANTWQFGFNTYKKGLLKQDDILGLKLVGYRSRIDNYIHN 

I I II I I I M I I I I M II II I I I I II I I I I I II I I II I I I M I I M I I I I I I I I I I I I I 
orf 133 - 1 QEMYFSQIGDSGVHTALKPERANTWQFGFNTYKKGLLKQDDTLGLKLVGYRSRIDNYIHN 
570 580 590 600 610 620 

670 680 690 700 710 720 

35 orf 133ng-l .pep VYGKWWDLNGD I PSWVGSTGLAYTI RHRNFKDKVHKHGFELELNYDYGRFFTNLSYAYQK 

i M I I I I I I I I I I hi I I I M I I : I I I I I I I I I I I I M I I I I I I II I M I I I I I I I I I 
orf 133-1 VYGKWWDLNGD I PSWVSSTGLAYT I QHRNFKDKVHKHGFELELNYDYGRFFTNLSYAYQK 

630 640 650 660 670 680 

730 740 750 760 770 780 

40 orf 133ng-l .pep STQPTNFSDASESPNNASKEDQLKQGYGLSRVSALPRDYGRLEVGTRWLGNKLTLGGAMR 

I I I I I I I I I I I I I I I I Ml I I I I I I I I I I I M II I I U I I I I I I I I H I I I I I I I I I 
orf 133 - 1 STQPTNFSDASESPNNASKEDQLKQGYGLSRVSALPRDYGRLEVGTRWLGNKLTLGGAMR 
690 700 710 720 730 740 

790 800 810 820 830 840 

45 orf 133ng-l .pep YFGKS I RATAEERY IDGTNGGNTSNVRQLGKRS I KQTETLARQPL I FDFYAAYE PKKNL I 

II 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 II 1 1 1 1 1 II 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 M I 

orfl33-l YFGKSIRATAEERYIDGTNGGNTSNFRQLGKRSIKQTETLARQPLIFDFYAAYEPKKNLI 
750 760 770 780 790 800 
850 860 870 880 890 900 

orf 133ng-l.pep FRAEVKNLFDRRYIDPLDAGNDAATQRYYSSFDPKDKDEDVTCNADKTLCNGKYGGTSKS 

I I I I I I M I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I M I I I I I I I I I I I I I I I I I I I 
orf 133 - 1 FRAEVKNLFDRRYIDPLDAGNDAATQRYYSSFDPKDKDEDVTCNADKTLCNGKYGGTSKS 
5 810 820 830 840 850 860 

910 920 
orf 133ng-1.pep VLTNFARGRTFLMTMSYKFX 

llllllllll Mllllll 
orf 133 - 1 VLTNFARGRTFLMTMSYKFX 
10 870 880 

In addition, ORF133ng-1 (SEQ ID NO: 884) is homologous to a TonB-dependent receptor (SEQ 
ID NO: 1167) in H. influenzae: 






sp|P45114|YC17_HAEIN PROBABLE TONB-DEPENDENT RECEPTOR HI1217 PRECURSOR 
>gi|1075372|pir||G64110 transferrin binding protein 1 precursor (tbp1) homolog - 
Haemophilus influenzae (strain Rd KW20) >gi|1574147 (U32801) transferrin binding 
protein 1 precursor (tbp1) [Haemophilus influenzae] Length = 913 
Score = 930 bits (2377), Expect = 0.0 

Identities = 476/921 (51%), Positives = 619/921 (66%), Gaps = 72/921 (7%) 

Query: 38 QVLEDVHVKAKRVPKDKKVFTDARAVSTRQDVFKSGENLDNIVRSIPGAFTQQDKSSGIV 97 

+ L + V K + DKK FT+A+A STR++VFK + +D + +RS I PGAFTQQDK SG+V 
Sbjct: 29 ETLGQIDVVEKVISNDKKPFTEAKAKSTRENVFKETQTIDQVIRSIPGAFTQQDKGSGVV 88 

Query: 98 SLNIRGDSGFGRVNTMVDGITQTFYSTSTDAGRAGGSSQFGASVDSNFIAGLDVVKGSFS 157 

S+NIRG++G GRVNTMVDG+TQTFYST+ D+G++GGSSQFGA++D NFIAG+DV K +FS 
Sbjct: 89 SVNIRGENGLGRVNTMVDGVTQTFYSTALDSGQSGGSSQFGAAIDPNFIAGVDVNKSNFS 148 

Query: 158 GSAGINSLAGSANLRTLGVDDVVQXXXXXXXXXXXXXXXXXXXXXAMAAIGARKWLESGA 217 

G++GIN+LAGSAN RTLGV+DV+ M RKWL++G 

Sbjct: 149 GASGINALAGSANFRTLGVNDVITDDKPFGIILKGMTGSNATKSNFMTMAAGRKWLDNGG 208 

Query: 218 SVGVLYGHSRRGVAQNYRVGGGGQHIGNFGEEYLERRKQQYFVQEGGLKFNAGSGKWERD 277 

VGV+YG+S+R V+Q+YR+ GGG+ + + G++ L + K+ YF + G N G+W D 
Sbjct: 209 YVGVVYGYSQREVSQDYRI-GGGERLASLGQDILAKEKEAYF-RNAGYILNP-EGQWTPD 265 

Query: 278 LQRQYWK TKWY KKYEDPQELQK YIEE 303 

L +++W +Y KK +D ++LQK IEE 

Sbjct: 266 LS KKHWS CNKPDYQKNGDCS Y YRIGS AAKTRRE I LQELLTNGKKPKD I EKLQKGNDG I EE 325 

Query: 304 HDKSWRENLAPQYDITPIDPSGLKQQSAGNLFKLEYDGVFNKYTAQFRDLNTRIGSRKII 363 

DKS+ N QY + PI+P L+ +S +L K EY AQ R L+ +IGSRKI 

Sbjct: 326 TDKSFERN-KDQYSVAPIEPGSLQSRSRSHLLKFEYGDDHQNLGAQLRTLDNKIGSRKIE 384 

Query: 364 NRNYQFNYGLSLNPYTNLNLTAAYNSGRQKYPKGAKFTGWGLLKDFETYNNAKILDLNNT 423 

NRNYQ NY + N Y +LNL AA+N G+ YPKG F GW + T N A I+D+NN+ 

Sbjct: 385 NRNYQVNYNFNNNSYLDLNLMAAHNIGKTIYPKGGFFAGWQVADKLITKNVANIVDINNS 444 

Query: 424 ATFRLPRETELQTTLGFNYFHNEYGKNRFPEELGLFFDGPDQDNGLYSY- - LGRFKGDKG 481 

TF LP+E +L+TTLGFNYF NEY KNRFPEEL LF++ D GLYS+ GR+ G K 
Sbjct: 445 HTFLLPKEIDLKTTLGFNYFTNEYSKNRFPEELSLFYNDASHDQGLYSHSKRGRYSGTKS 504 



Query: 482 LLPQKSTIVQPAGSQYFNTFYFDAALKKDIYRLNYSTNAINYRFGGEYTGYYGSENEFKR 541 
LLPQ+S I+QP+G Q F T YFD AL K IY LNYS N +Y F GEY GY 
Sbjct: 505 LLPQRSVILQPSGKQKFKTVYFDTALSKGIYHLNYSVNFTHYAFNGEYVGY 555 

Query: 542 AFGENSPAYKEHCDPSCGLYEPVLKKYGKKRANNHSVSISADFGDYFMPFAGYSRTHRMP 601 

EN+ + + EP+L K G K+A NHS ++SA+ DYFMPF YSRTHRMP 

Sbjct: 556 ENTAGQQ I NEP I LHKSGHKKAFNHSATLSAELSDYFMPFFT YSRTHRMP 604 

Query: 602 NIQEMYFSQIGDSGVHTALKPERANTWQFGFNTYKKGLLKQDDILGLKLVGYRSRIDNYI 661 

NIQEM+FSQ+ ++GV+TALKPE+++T+Q GFNTYKKGL  QDD+LG+KLVGYRS I NYI 
Sbjct: 605 NIQEMFFSQVSNAGVNTALKPEQSDTYQLGFNTYKKGLFTQDDVLGVKLVGYRSFIKNYI 664 

Query: 662 HNVYGKWWDLNGDIPSWVGSTGLAYTIRHRNFKDKVHKHGFELELNYDYGRFFTNLSYAY 721 

HNVYG WW +P+W S G YTI H+N+K V K G ELE+NYD GRFF N+SYAY 

Sbjct: 665 HNVYGVWW- -RDGMPTWAESNGFKYTIAHQNYKPIVKKSGVELEINYDMGRFFANVSYAY 722 

Query: 722 QKSTQPTNFSDASESPNNASKEDQLKQGYGLSRVSALPRDYGRLEVGTRWLGNKLTLGGA 781 

Q++ QPTN++DAS PNNAS+ED LKQGYGLSRVS LP+DYGRLE+GTRW KLTLG A 
Sbjct: 723 QRTNQPTNYADASPRPNNASQEDILKQGYGLSRVSMLPKDYGRLELGTRWFDQKLTLGLA 782 

Query: 782 MRYFGKSIRATAEERYIDGTNGGNTSNVRQLGKRSIKQTETLARQPLIFDFYAAYEPKKN 841 

RY+GKS RAT EE YI+G+ + +R+ ++K+TE + +QP+I D + +YEP K+ 

Sbjct: 783 ARYYGKSKRATIEEEYINGSR-FKKNTLRRENYYAVKKTEDIKKQPIILDLHVSYEPIKD 841 

Query: 842 LIFRAEVKNLFDRRYIDPLDAGNDAATQRYYSSFDPKDKDEDVTCNADKTLCNGKYGGTS 901 

LI +AEV+NL D+RY+DPLDAGNDAA+QRYYSS + + C D + C GG+ 
Sbjct: 842 LIIKAEVQNLLDKRYVDPLDAGNDAASQRYYSSL NNS I ECAQDSSAC GGSD 892 

Query: 902 KSVLTNFARGRTFLMTMSYKF 922 

K+VL NFARGRT++++++YKF 
Sbjct: 893 KTVLYNFARGRTYILSLNYKF 913 
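
By way of illustration only, an identity figure of the kind reported for the alignment above can be checked by counting the alignment columns in which both rows carry the same residue and dividing by the number of aligned columns. The following Python sketch is not part of the original analysis: the function name percent_identity and the handling of the gap character are assumptions made for this example. It is applied to the final block of the alignment (Query 902-922 against Sbjct 893-913).

```python
def percent_identity(query: str, subject: str, gap: str = "-") -> float:
    """Return the percentage of alignment columns with identical residues.

    Both arguments are rows of one alignment and must have equal length.
    A column in which both rows hold the gap character is not counted
    as an identity (an assumption made for this sketch).
    """
    if len(query) != len(subject):
        raise ValueError("aligned rows must have equal length")
    if not query:
        raise ValueError("alignment is empty")
    identical = sum(
        1 for q, s in zip(query, subject) if q == s and q != gap
    )
    return 100.0 * identical / len(query)


# Final block of the alignment: Query 902-922 and Sbjct 893-913.
query = "KSVLTNFARGRTFLMTMSYKF"
subject = "KTVLYNFARGRTYILSLNYKF"

print(f"{percent_identity(query, subject):.1f}% identity over {len(query)} aa")
```

Applied block by block and summed over the whole alignment, the same count yields the "Identities" figure of the header, which is expressed over the full aligned length including gap columns.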

The underlined motif in the gonococcal protein (also present in the meningococcal protein) is 
predicted to be an ATP/GTP-binding site motif A (P-loop), and the analysis suggests that these 
proteins from N. meningitidis and N. gonorrhoeae, and their epitopes, could be useful antigens for 
vaccines or diagnostics, or for raising antibodies. 



Example 104 

The following partial DNA sequence was identified in N. meningitidis [<SEQ ID 885>] (SEQ ID 
NO: 885): 

1 ATGAACCTGA TTTCACGTTA CATCATCCGT CAAATGGCGG TTATGGCGGT 

51 TTACGCGCTC CTTGCCTTCC TCGCTTTGTA CAGCTTTTTT GAAATCCTGT 

101 ACGAAACCGG CAACCTCGGC AAAGGCAGTT ACGGCATATG GGAAATGCTG 

151 GGCTACACCG CCCTCAAAAT GCCCGCCCGC GCCTACGAAC TGATTCCCCT 

201 CGCCGTCCTT ATCGGCGGAC TGGTCTCCCT CAGCCAGCTT GCCGCCGGCA 

251 GCGAACTGAC CGTCATCAAA GCCAGCGGCA TGAGCACCAA AAAGCTGCTG 

301 TTGATTCTGT CGCAGTTCGG TTTTATTTTT GCTATTGCCA CCGTCGCGCT 

351 CGGCGAATGG GTTGCGCCCA CACTGAGCCA AAAAGCCGAA AACATCAAAG 

401 CCGCCGCCAT CAACGGCAAA ATCAGCACCG GCAATACCGG CCTTTGGCTG 

451 AAAGAAAAAA ACAGCGTGAT CAATGTGCGC GAAATGTTGC CCGACCAT . . 

This corresponds to the amino acid sequence [<SEQ ID 886; ORF112>] (SEQ ID NO: 886; 
ORF112): 

1 MNLISRYIIR QMAVMAVYAL LAFLALYSFF EILYETGNLG KGSYGIWEML 

51 GYTALKMPAR AYELIPLAVL IGGLVSLSQL AAGSELTVIK ASGMSTKKLL 

101 LILSQFGFIF AIATVALGEW VAPTLSQKAE NIKAAAINGK ISTGNTGLWL 

151 KEKNSVINVR EMLPDH . . . 

Further work revealed a further partial nucleotide sequence [<SEQ ID 887>] (SEQ ID NO: 887): 



1 ATGAACCTGA TTTCACGTTA CATCATCCGT CAAATGGCGG TTATGGCGGT 

51 TTACGCGCTC CTTGCCTTCC TCGCTTTGTA CAGCTTTTTT GAAATCCTGT 

101 ACGAAACCGG CAACCTCGGC AAAGGCAGTT ACGGCATATG GGAAATGCTG 

151 gGCTACACCG CCCTCAAAAT GCCCGCCCGC GCCTACGAAC TGATTCCCCT 

201 CGCCGTCCTT ATCGGCGGAC TGGTCTCCCT CAGCCAGCTT GCCGCCGGCA 
251 GCGAACTGAC CGTCATCAAA GCCAGCGGCA TGAGCACCAA AAAGCTGCTG 
301 TTGATTCTGT CGCAGTTCGG TTTTATTTTT GCTATTGCCA CCGTCGCGCT 
351 CGGCGAATGG GTTGCGCCCA CACTGAGCCA AAAAGCCGAA AACATCAAAG 
401 CCGCCGCCAT CAACGGCAAA ATCAGCACCG GCAATACCGG CCTTTGGCTG 
451 AAAGAAAAAA ACAGCrTkAT CAATGTGCGC GAAATGTTGC CCGACCATAC 
501 GCTTTTGGGC ATCAAAATTT GGGCGCGCAA CGATAAAAAC GAATTGGCAG 
551 AGGCAGTGGA AGCCGATTCC GCCGTTTTGA ACAGCGACGG CAGTTGGCAG 
601 TTGAAAAACA TCCGCCGCAG CACGCTTGGC GAAGACAAAG TCGAGGTCTC 
651 TATTGCGGCT GAAGAAAACT GGCCGATTTC CGTCAAACGC AACCTGATGG 
701 ACGTATTGCT CGTCAAACCC GACCAAATGT CCGTCGGCGA ACTGACCACC 
751 TACATCCGCC ACCTCCAAAA CAACAGCCAA AACACCCGAA TCTACGCCAT 
801 CGCATGGTGG CGCAAATTGG TTTACCCCGC CGCAGCCTGG GTGATGGCGC 
851 TCGTCGCCTT TGCCTTTACC CCGCAAACCA CCCGCCACGG CAATATGGGC 
901 TTAAAACTCT TCGGCGGCAT CTGTsTCGGA TTGCTGTTCC ACCTTGCCGG 
951 ACGGCTCTTT GGGTTTACCA GCCAACTCGG. . . 

This corresponds to the amino acid sequence [<SEQ ID 888; ORF112-1>] (SEQ ID NO: 888; 
ORF112-1): 

1 MNLISRYIIR QMAVMAVYAL LAFLALYSFF EILYETGNLG KGSYGIWEML 

51 GYTALKMPAR AYELIPLAVL IGGLVSLSQL AAGSELTVIK ASGMSTKKLL 

101 LILSQFGFIF AIATVALGEW VAPTLSQKAE NIKAAAINGK ISTGNTGLWL 

151 KEKNSXINVR EMLPDHTLLG IKIWARNDKN ELAEAVEADS AVLNSDGSWQ 

201 LKNIRRSTLG EDKVEVSIAA EENWPISVKR NLMDVLLVKP DQMSVGELTT 

251 YIRHLQNNSQ NTRIYAIAWW RKLVYPAAAW VMALVAFAFT PQTTRHGNMG 

301 LKLFGGICXG LLFHLAGRLF GFTSQL . . . 

Computer analysis of this amino acid sequence predicts two transmembrane domains and gave the 
following results: 

Homology with a predicted ORF from N. meningitidis (strain A) 

ORF112 (SEQ ID NO: 886) shows 96.4% identity over a 166 aa overlap with an ORF (ORF112a) 
(SEQ ID NO: 890) from strain A of N. meningitidis: 

                     10        20        30        40        50        60
orf112.pep   MNLISRYIIRQMAVMAVYALLAFLALYSFFEILYETGNLGKGSYGIWEMLGYTALKMPAR
             ||||||||||||||||||||||||||||||||||||||||||||||||| ||||||| ||
orf112a      MNLISRYIIRQMAVMAVYALLAFLALYSFFEILYETGNLGKGSYGIWEMXGYTALKMXAR
                     10        20        30        40        50        60

                     70        80        90       100       110       120
orf112.pep   AYELIPLAVLIGGLVSLSQLAAGSELTVIKASGMSTKKLLLILSQFGFIFAIATVALGEW
             ||||:||||||||||| ||||||||| |||||||||||||||||||||||||||||||||
orf112a      AYELMPLAVLIGGLVSXSQLAAGSELXVIKASGMSTKKLLLILSQFGFIFAIATVALGEW
                     70        80        90       100       110       120

                    130       140       150       160
orf112.pep   VAPTLSQKAENIKAAAINGKISTGNTGLWLKEKNSVINVREMLPDH
             |||||||||||||||||||||||||||||||||||:||||||||||
orf112a      VAPTLSQKAENIKAAAINGKISTGNTGLWLKEKNSIINVREMLPDHTLLGIKIWARNDKN
                    130       140       150       160       170       180

orf112a      ELAEAVEADSAVLNSDGSWQLKNIRRSTLGEDKVEVSIAAEEXWPISVKRNLMDVLLVKP
                    190       200       210       220       230       240

The ORF112a nucleotide sequence [<SEQ ID 889>] (SEQ ID NO: 889) is:

1 ATGAACCTGA TTTCACGTTA CATCATCCGT CAAATGGCGG TTATGGCGGT 

51 TTACGCGCTC CTTGCCTTCC TCGCTTTGTA CAGCTTTTTT GAAATCCTGT 

101 ACGAAACCGG CAACCTCGGC AAAGGCAGTT ACGGCATATG GGAAATGNTG 

151 GGNTACACCG CCCTCAAAAT GNCCGCCCGC GCCTAGGAAC TGATGCCCCT 

201 CGCCGTCCTT ATCGGCGGAC TGGTCTCTNT CAGCCAGCTT GCCGCCGGCA 

251 GCGAACTGAN CGTCATCAAA GCCAGCGGCA TGAGCACCAA AAAGCTGCTG 

301 TTGATTCTGT CGCAGTTCGG TTTTATTTTT GCTATTGCCA CCGTCGCGCT 

351 CGGCGAATGG GTTGCGCCCA CACTGAGCCA AAAAGCCGAA AACATCAAAG 

401 CCGCGGCCAT CAACGGCAAA ATCAGTACCG GCAATACCGG CCTTTGGCTG 

451 AAAGAAAAAA ACAGCATTAT CAATGTGCGC GAAATGTTGC CCGACCATAC 

501 CCTGCTGGGC ATTAAAATCT GGGCCCGCAA CGATAAAAAC GAACTGGCAG 

551 AGGCAGTGGA AGCCGATTCC GCCGTTTTGA ACAGCGACGG CAGTTGGCAG 

601 TTGAAAAACA TCCGCCGCAG CACGCTTGGC GAAGACAAAG TCGAGGTCTC 

651 TATTGCGGCT GAAGAAAANT GGCCGATTTC CGTCAAACGC AACCTGATGG 

701 ACGTATTGCT CGTCAAACCC GACCAAATGT CCGTCGGCGA ACTGACCACC 

751 TACATCCGCC ACCTCCAAAN NNACAGCCAA AACACCCGAA TCTACGCCAT 

801 CGCATGGTGG CGCAAATTGG TTTACCCCGC CGCAGCCTGG GTGATGGCGC 

851 TCGTCGCCTT TGCCTTTACC CCGCAAACCA CCCGCCACGG CAATATGGGC 

901 TTAAAANTCT TCGGCGGCAT CTGTCTCGGA TTGCTGTTCC ACCTTGCCGG 

951 NCGGCTCTTC NGGTTTACCA GCCAACTCTA CGGCATCCCG CCCTTCCTCG 

1001 NCGGCGCACT ACCTACCATA GCCTTCGCCT TGCTCGCCGT TTGGCTGATA 

1051 CGCAAACAGG AAAAACGCTA A 

This encodes a protein having the amino acid sequence [<SEQ ID 890>] (SEQ ID NO: 890):



1 MNLISRYIIR QMAVMAVYAL LAFLALYSFF EILYETGNLG KGSYGIWEMX 
51 GYTALKMXAR AYELMPLAVL IGGLVSXSQL AAGSELXVIK ASGMSTKKLL 
101 LILSQFGFIF AIATVALGEW VAPTLSQKAE NIKAAAINGK ISTGNTGLWL 
151 KEKNSIINVR EMLPDHTLLG IKIWARNDKN ELAEAVEADS AVLNSDGSWQ 
201 LKNIRRSTLG EDKVEVSIAA EEXWPISVKR NLMDVLLVKP DQMSVGELTT 
251 YIRHLQXXSQ NTRIYAIAWW RKLVYPAAAW VMALVAFAFT PQTTRHGNMG 
301 LKXFGGICLG LLFHLAGRLF XFTSQLYGIP PFLXGALPTI AFALLAVWLI 
351 RKQEKR*
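
The relationship between a listed nucleotide sequence and the protein it encodes can be illustrated with a minimal Python sketch. It assumes the standard genetic code and reading frame 1, and it simply reports any codon containing an ambiguous base (such as N) as X. That is a simplification: a listing may still resolve such a codon when every possible base gives the same residue. The function name `translate` is illustrative only and is not part of the specification.

```python
# Standard genetic code, written in the conventional T, C, A, G base order.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    b1 + b2 + b3: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, b1 in enumerate(BASES)
    for j, b2 in enumerate(BASES)
    for k, b3 in enumerate(BASES)
}


def translate(dna: str) -> str:
    """Translate a DNA string in frame 1; whitespace is ignored.

    Codons that are not plain A/C/G/T triplets (for example 'NTG')
    are reported as 'X'. Stop codons are reported as '*'.
    """
    dna = "".join(dna.split()).upper()
    protein = []
    for start in range(0, len(dna) - 2, 3):
        codon = dna[start:start + 3]
        protein.append(CODON_TABLE.get(codon, "X"))
    return "".join(protein)


# First 30 bases of the ORF112a nucleotide listing (SEQ ID NO: 889).
print(translate("ATGAACCTGA TTTCACGTTA CATCATCCGT"))
```

Applied to the first thirty bases of the ORF112a listing, the sketch yields the first ten residues of the encoded protein.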

ORF112a (SEQ ID NO: 890) and ORF112-1 (SEQ ID NO: 888) show 96.3% identity in 326 aa overlap:



orf112a.pep  MNLISRYIIRQMAVMAVYALLAFLALYSFFEILYETGNLGKGSYGIWEMXGYTALKMXAR
             ||||||||||||||||||||||||||||||||||||||||||||||||| ||||||| ||
orf112-1     MNLISRYIIRQMAVMAVYALLAFLALYSFFEILYETGNLGKGSYGIWEMLGYTALKMPAR

orf112a.pep  AYELMPLAVLIGGLVSXSQLAAGSELXVIKASGMSTKKLLLILSQFGFIFAIATVALGEW
             ||||:||||||||||| ||||||||| |||||||||||||||||||||||||||||||||
orf112-1     AYELIPLAVLIGGLVSLSQLAAGSELTVIKASGMSTKKLLLILSQFGFIFAIATVALGEW

orf112a.pep  VAPTLSQKAENIKAAAINGKISTGNTGLWLKEKNSIINVREMLPDHTLLGIKIWARNDKN
             ||||||||||||||||||||||||||||||||||| ||||||||||||||||||||||||
orf112-1     VAPTLSQKAENIKAAAINGKISTGNTGLWLKEKNSXINVREMLPDHTLLGIKIWARNDKN

orf112a.pep  ELAEAVEADSAVLNSDGSWQLKNIRRSTLGEDKVEVSIAAEEXWPISVKRNLMDVLLVKP
             |||||||||||||||||||||||||||||||||||||||||| |||||||||||||||||
orf112-1     ELAEAVEADSAVLNSDGSWQLKNIRRSTLGEDKVEVSIAAEENWPISVKRNLMDVLLVKP

orf112a.pep  DQMSVGELTTYIRHLQXXSQNTRIYAIAWWRKLVYPAAAWVMALVAFAFTPQTTRHGNMG
             ||||||||||||||||  ||||||||||||||||||||||||||||||||||||||||||
orf112-1     DQMSVGELTTYIRHLQNNSQNTRIYAIAWWRKLVYPAAAWVMALVAFAFTPQTTRHGNMG

orf112a.pep  LKXFGGICLGLLFHLAGRLFXFTSQLYGIPPFLXGALPTIAFALLAVWLIRKQEKRX
             || ||||| ||||||||||| |||||
orf112-1     LKLFGGICXGLLFHLAGRLFGFTSQL
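
As an illustration of how a percent-identity figure of this kind can be obtained, the following is a minimal Python sketch for two sequences that are already aligned without gaps. It is not the program used to generate the alignments in this specification. It assumes that an undetermined residue (X) never counts as a match and that the overlap is the length of the shorter sequence; the function name `percent_identity` is hypothetical.

```python
def percent_identity(seq_a: str, seq_b: str) -> float:
    """Percent identity over the ungapped overlap of two sequences.

    The overlap is the length of the shorter sequence. A position is a
    match only if both residues are the same and are not 'X'.
    """
    overlap = min(len(seq_a), len(seq_b))
    if overlap == 0:
        raise ValueError("both sequences must be non-empty")
    matches = sum(
        1 for a, b in zip(seq_a, seq_b) if a == b and a != "X"
    )
    return 100.0 * matches / overlap


# Residues 1-60 of ORF112a (SEQ ID NO: 890) and ORF112-1 (SEQ ID NO: 888),
# assembled from the ten-residue groups of the listings.
orf112a_1_60 = (
    "MNLISRYIIR" "QMAVMAVYAL" "LAFLALYSFF"
    "EILYETGNLG" "KGSYGIWEMX" "GYTALKMXAR"
)
orf112_1_1_60 = (
    "MNLISRYIIR" "QMAVMAVYAL" "LAFLALYSFF"
    "EILYETGNLG" "KGSYGIWEML" "GYTALKMPAR"
)

print(round(percent_identity(orf112a_1_60, orf112_1_1_60), 1))
```

Over these first 60 residues the two sequences differ only at positions 50 and 58, where ORF112a carries an undetermined residue.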

Homology with a predicted ORF from N. gonorrhoeae 

ORF112 (SEQ ID NO: 886) shows 95.8% identity over a 166 aa overlap with a predicted ORF (ORF112ng) (SEQ ID NO: 892) from N. gonorrhoeae:



orf112.pep   MNLISRYIIRQMAVMAVYALLAFLALYSFFEILYETGNLGKGSYGIWEMLGYTALKMPAR 60
             ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
orf112ng     MNLISRYIIRQMAVMAVYALLAFLALYSFFEILYETGNLGKGSYGIWEMLGYTALKMPAR 60

orf112.pep   AYELIPLAVLIGGLVSLSQLAAGSELTVIKASGMSTKKLLLILSQFGFIFAIATVALGEW 120
             ||||:|||||||||:|||||||||||:||||||||||||||||||||||||||:||||||
orf112ng     AYELMPLAVLIGGLASLSQLAAGSELAVIKASGMSTKKLLLILSQFGFIFAIAAVALGEW 120

orf112.pep   VAPTLSQKAENIKAAAINGKISTGNTGLWLKEKNSVINVREMLPDH 166
             |||||||||||||||||||||||||||||||||:|:|||| |||||
orf112ng     VAPTLSQKAENIKAAAINGKISTGNTGLWLKEKTSIINVRGMLPDHTLLGIKIWARNDKN 180

The complete length ORF112ng nucleotide sequence [<SEQ ID 891>] (SEQ ID NO: 891) is:

1 ATGAACCTGA TTTCACGTTA CATCATCCGC CAAATGGCGG TTATGGCGGT 

51 TTACGCGCTC CTTGCCTTCC TCGCTTTGTA CAGCTTTTTT GAAATCCTGT 

101 ACGAAACCGG CAACCTCGGC AAAGGCAGTT ACGGCATATG GGAAATGCTG 

151 GGCTACACCG CCCTCAAAAT GCCCGCCCGC GCCTACGAAC TCATGCCCCT 

201 CGCCGTCCTC ATCGGCGGAC TGGCCTCTCT CAGCCAGCTT GCCGCCGGCA 

251 GCGAACTGGC CGTCATCAAA GCCAGCGGCA TGAGCACCAA AAAGCTGCTG 

301 TTGATTCTGT CTCAGTTCGG TTTTATTTTT GCTATTGCCG CCGTCGCGCT 

351 CGGCGAATGG GTTGCGCCCA CGCTGAGCCA AAAAGCCGAA AACATCAAag 

401 cCGCCGCCAt taacggCAAA ATCAGCAccg gcAATACCGG CCTTTggcTG 

451 AAAGAAAAAa CCAGCATTAT CAATGTGcGc GGAATGTTGC CCGACCATAC 

501 GCTTTTGGGC ATCAAAATTT GGGCGCGCAA CGATAAAAAC GAATTGGCAG 

551 AGGCAGTGGA AGCCGATTCC GCCGTTTTGA ACAGCGACGG CAGCTGGCAG 

601 TTGAAAAACA TCCGCCGCAG CATCATGGGT ACAGACAAAA TCGAAACATC 

651 cgCCGCCGCC GAAGAAACTT gGCCGATTGC CGTCAGACGC AACCTGATGG 

701 ACGTATTGCT CGTCAAGCCC GACCAAATGT CCGTCGGCGA GCTGACCACC 

751 TACATCCGCC ACCTCCAAAA CAACAGCCAA AACACCCAAA TCTACGCCAT 

801 CGCATGGTGG CGTAAACTCG TTTACCCCGT CGCCGCATGG GTCATGGCGC 

851 TCGTTGCCTT CGCCTTTACG CCGCAAACCA CGCGCCACGG CAATATGGGC 

901 TTAAAACTCT TCGGCGGCAT CTGTCTCGGA TTGCTGTTCC ACCTTGCCGG 

951 CAGGCTCTTC GGGTTTACCA GCCAACTCTA CGGCACCCCA CCCTTCCTCG 

1001 CCGGCGCACT GCCTACCATA GCCTTCGCCT TGCTCGCTGT TTGGCTGATA 

1051 CGCAAACAGG AAAAACGTTG A 



This encodes a protein having the amino acid sequence [<SEQ ID 892>] (SEQ ID NO: 892):

1 MNLISRYIIR QMAVMAVYAL LAFLALYSFF EILYETGNLG KGSYGIWEML 

51 GYTALKMPAR AYELMPLAVL IGGLASLSQL AAGSELAVIK ASGMSTKKLL 

101 LILSQFGFIF AIAAVALGEW VAPTLSQKAE NIKAAAINGK ISTGNTGLWL 

151 KEKTSIINVR GMLPDHTLLG IKIWARNDKN ELAEAVEADS AVLNSDGSWQ 

201 LKNIRRSIMG TDKIETSAAA EETWPIAVRR NLMDVLLVKP DQMSVGELTT 

251 YIRHLQNNSQ NTQIYAIAWW RKLVYPVAAW VMALVAFAFT PQTTRHGNMG 

301 LKLFGGICLG LLFHLAGRLF GFTSQLYGTP PFLAGALPTI AFALLAVWLI 

351 RKQEKR* 

ORF112ng (SEQ ID NO: 892) and ORF112-1 (SEQ ID NO: 888) show 94.2% identity in 326 aa overlap:



                     10        20        30        40        50        60
orf112ng     MNLISRYIIRQMAVMAVYALLAFLALYSFFEILYETGNLGKGSYGIWEMLGYTALKMPAR
             ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
orf112-1     MNLISRYIIRQMAVMAVYALLAFLALYSFFEILYETGNLGKGSYGIWEMLGYTALKMPAR
                     10        20        30        40        50        60

                     70        80        90       100       110       120
orf112ng     AYELMPLAVLIGGLASLSQLAAGSELAVIKASGMSTKKLLLILSQFGFIFAIAAVALGEW
             ||||:|||||||||:|||||||||||:||||||||||||||||||||||||||:||||||
orf112-1     AYELIPLAVLIGGLVSLSQLAAGSELTVIKASGMSTKKLLLILSQFGFIFAIATVALGEW
                     70        80        90       100       110       120

                    130       140       150       160       170       180
orf112ng     VAPTLSQKAENIKAAAINGKISTGNTGLWLKEKTSIINVRGMLPDHTLLGIKIWARNDKN
             |||||||||||||||||||||||||||||||||:| |||| |||||||||||||||||||
orf112-1     VAPTLSQKAENIKAAAINGKISTGNTGLWLKEKNSXINVREMLPDHTLLGIKIWARNDKN
                    130       140       150       160       170       180

                    190       200       210       220       230       240
orf112ng     ELAEAVEADSAVLNSDGSWQLKNIRRSIMGTDKIETSAAAEETWPIAVRRNLMDVLLVKP
             ||||||||||||||||||||||||||| :| ||:|:| ||||:|||:|:|||||||||||
orf112-1     ELAEAVEADSAVLNSDGSWQLKNIRRSTLGEDKVEVSIAAEENWPISVKRNLMDVLLVKP
                    190       200       210       220       230       240

                    250       260       270       280       290       300
orf112ng     DQMSVGELTTYIRHLQNNSQNTQIYAIAWWRKLVYPVAAWVMALVAFAFTPQTTRHGNMG
             ||||||||||||||||||||||:|||||||||||||:|||||||||||||||||||||||
orf112-1     DQMSVGELTTYIRHLQNNSQNTRIYAIAWWRKLVYPAAAWVMALVAFAFTPQTTRHGNMG
                    250       260       270       280       290       300

                    310       320       330       340       350
orf112ng     LKLFGGICLGLLFHLAGRLFGFTSQLYGTPPFLAGALPTIAFALLAVWLIRKQEKRX
             |||||||| |||||||||||||||||
orf112-1     LKLFGGICXGLLFHLAGRLFGFTSQL
                    310       320

This analysis suggests that these proteins from N. meningitidis and N. gonorrhoeae, and their 
epitopes, could be useful antigens for vaccines or diagnostics, or for raising antibodies. 

Example 105 

Table III lists several Neisseria strains which were used to assess the conservation of the sequence of ORF 4 (SEQ ID NO: 216) among different strains.

TABLE III - List of Neisseria Strains Used for Gene Variability Study of ORF 4 (SEQ ID NO: 216)



ORF4 gene variability: List of used Neisseria strains

Identification   Strain              Source / reference

Group B
zv01_4           NG6/88              R. Moxon / Seiler et al., 1996
zv02_4           BZ198               R. Moxon / Seiler et al., 1996
zv03_4ass        NG3/88              R. Moxon / Seiler et al., 1996
zv04_4           297-0               R. Moxon / Seiler et al., 1996
zv05_4           1000                R. Moxon / Seiler et al., 1996
zv06_4           BZ147               R. Moxon / Seiler et al., 1996
zv07_4           BZ169               R. Moxon / Seiler et al., 1996
zv08_4           528                 R. Moxon / Seiler et al., 1996
zv09_4           NGP165              R. Moxon / Seiler et al., 1996
zv10_4           BZ133               R. Moxon / Seiler et al., 1996
zv11_4           NGE31               R. Moxon / Seiler et al., 1996
zv12_4ass        NGF26               R. Moxon / Seiler et al., 1996
zv13_4           NGE28               R. Moxon / Seiler et al., 1996
zv15_4           SWZ107              R. Moxon / Seiler et al., 1996
zv16_4           NGH15               R. Moxon / Seiler et al., 1996
zv17_4           NGH36               R. Moxon / Seiler et al., 1996
zv18_4           BZ232               R. Moxon / Seiler et al., 1996
zv19_4           BZ83                R. Moxon / Seiler et al., 1996
zv20_4           44/76               R. Moxon / Seiler et al., 1996
zv21_4           MC58                R. Moxon
zv96_4           2996                Our collection

Group A
zv22_4           205900              R. Moxon
z2491_4          Z2491               R. Moxon / Maiden et al., 1998

Group C
zv24_4           90/18311            R. Moxon
zv25_4           93/4286             R. Moxon

Others
zv26_4ass        A22 (group W)       R. Moxon / Maiden et al., 1998
zv27_4           E26 (group X)       R. Moxon / Maiden et al., 1998
zv28_4           860800 (group Y)    R. Moxon / Maiden et al., 1998
zv29_4           E32 (group Z)       R. Moxon / Maiden et al., 1998

Gonococcus
zv32_4           Ng F62              R. Moxon / Maiden et al., 1998
zv33_4           Ng SN4              R. Moxon
fa1090_4         FA 1090             R. Moxon

References: 






Seiler A. et al, Mol. Microbiol., 1996, 19(4):841-856. 
Maiden et al., Proc. Natl. Acad. Sci. USA, 1998, 95:3140-3145. 



The amino acid sequences for each listed strain are as follows: 



>FA1090_4 [<SEQ ID 893>] (SEQ ID NO: 893) 

MKTFFKTLSAAALALILAACGGQKDSAPAASAAAPSADNGAAKKEIVFGTTVGDFGDMVK
EQIQAELEKKGYTVKLVEFTDYVRPNLALAEGELDINVFQHKPYLDDFKKEHNLDITEAF
QVPTAPLGLYPGKLKSLEEVKDGSTVSAPNDPSNFARALVMLNELGWIKLKDGINPLTAS
KADIAENLKNIKIVELEAAQLPRSRADVDFAWNGNYAISSGMKLTEALFQEPSFAYVNW
SAVKTADKDSQWLKDVTEAYNSDAFKAYAHKRFEGYKYPAAWNEGAAK*

>Z2491_4 [<SEQ ID 894>] (SEQ ID NO: 894)
MKTFFKTLSAAALALILAACGGQKDSAPAASASAAADNGAAKKEIVFGTTVGDFGDMVKE
QIQPELEKKGYTVKLVEFTDYVRPNLALAEGELDINVFQHKPYLDDFKKEHNLDITEVFQ
VPTAPLGLYPGKLKSLEEVKDGSTVSAPNDPSNFARVLVMLDELGWIKLKDGINPLTASK
ADIAENLKNIKIVELEAAQLPRSRADVDFAWNGNYAISSGMKLTEALFQEPSFAYVNWS
AVKTADKDSQWLKDVTEAYNSDAFKAYAHKRFEGYKSPAAWNEGAAK*

15 >ZV01_4 [<SEQ ID 895>] (SEQ ID NO: 895) 

MKTFFKTLSAAALALILAACGGQKDSAPAASASAAADNGAAKKEIVFGTTVGDFGDMVKE
QIQAELEKKGYTVKLVEFTDYVRPNLALAEGELDINVFQHKPYLDDFKKEHNLDITEVFQ
VPTAPLGLYPGKLKSLEEVKDGSTVSAPNDPSNFARVLVMLDELGWIKLKDGINPLTASK
ADIAENLKNIKIVELEAAQLPRSRADVDFAWNGNYAISSGMKLTEALFQEPSFAYVNWS
AVKTADKDSQWLKDVTEAYNSDAFKAYAHKRFEGYKSPAAWNEGAAK*

>ZV02_4 [<SEQ ID 896>] (SEQ ID NO: 896)

MKTFFKTLSAAALALILAACGGQKDSAPAASASAAADNGAEKKEIVFGTTVGDFGDMVKE
HIQPELEKKGYTVKLVEFTDYVRPNLALAEGELDINVFQHKPYLDDFKKEHNLDITEVFQ
VPTAPLGLYPGKLKSLEEVKDGSTVSAPNDPSNFARVLVMLDELGWIKLKDGINPLTASK
ADIAENLKNIKIVELEAAQLPRSRADVDFAWNGNYAISSGMKLTEALFQEPSFAYVNWS
AVKTADKDSQWLKDVTEAYNSDAFKAYAHKRFEGYKSPAARNEGAAK*

>ZV03_4ASS [<SEQ ID 897>] (SEQ ID NO: 897) 

MKTFFKTLSAAALALILAACGGQKDSAPAASASAAADNGAEKKEIVFGTTVGDFGDMVKE
HIQPELEKKGYTVKLVEFTDYVRPNLALAEGELDINVFQHKPYLDDFKKEHNLDITEVFQ
VPTAPLGLYPGKLKSLEEVKDGSTVSAPNDPSNFARVLVMLDELGWIKLKDGINPLTASK
ADIAENLKNIKIVELEAAQLPRSRADVDFAWNGNYAISSGMKLTEALFQEPSFAYVNWS
AVKTADKDSQWLKDVTEAYNSDAFKAYAHKRFEGYKSPAAWNEGAAK*

>ZV04_4 [<SEQ ID 898>] (SEQ ID NO: 898) 
MKTFFKTLSAAALALILAACGGQKDSAPAASASAAADNGAEKKEIVFGTTVGDFGDMVKE
HIQPELEKKGYTVKLVEFTDYVRPNLALAEGELDINVFQHKPYLDDFKKEHNLDITEVFQ
VPTAPLGLYPGKLKSLEEVKDGSTVSAPNDPSNFARVLVMLDELGWIKLKDGINPLTASK
ADIAENLKNIKIVELEAAQLPRSRADVDFAWNGNYAISSGMKLTEALFQEPSFAYVNWS
AVKTADKDSQWLKDVTEAYNSDAFKAYAHKRFEGYKSPAAWNEGAAK* 

>ZV05_4 [<SEQ ID 899>] (SEQ ID NO: 899)

MKTFFKTLSAAALALILAACGGQKDSAPAASASAAADNGAEKKEIVFGTTVGDFGDMVKE
HIQPELEKKGYTVKLVEFTDYVRPNLALAEGELDINVFQHKPYLDDFKKEHNLDITEVFQ
VPTAPLGLYPGKLKSLEEVKDGSTVSAPNDPSNFARVLVMLDELGWIKLKDGINPLTASK
ADIAENLKNIKIVELEAAQLPRSRADVDFAVVNGNYAISSGMKLTEALFQEPSFAYVNWS
AVKTADKDSQWLKDVTEAYNSDAFKAYAHKRFEGYKSPAAWNEGAAK*

>ZV06_4 [<SEQ ID 900>] (SEQ ID NO: 900)

MKTFFKTLSAAALALILAACGGQKDSAPAASASAAADNGAEKKEIVFGTTVGDFGDMVKE
QIQAELEKKGYTVKLVEFTDYVRPNLALAEGELDINVFQHKPYLDDFKKEHNLDITEVFQ
VPTAPLGLYPGKLKSLEEVKDGSTVSAPNDPSNFARVLVMLDELGWIKLKDGINPLTASK
ADIAENLKNIKIVELEAAQLPRSRADVDFAWNGNYAISSGMKLTEALFQEPSFAYVNWS
AVKTAHKDSQWLKDVTEAYNSDAFKAYAHKRFEGYKSPAAWNEGAAK*

>ZV07_4 [<SEQ ID 901>] (SEQ ID NO: 901) 

MKTFFKTLSAAALALILAACGGQKDSAPAASASAAADNGAAKKEIVFGTTVGDFGDMVKE
QIQAELEKKGYTVKLVEFTDYVRPNLALAEGELDINVFQHKPYLDDFKKEHNLDITEVFQ
VPTAPLGLYPGKLKSLEEVKDGSTVSAPNDPSNFARVLVMLDELGWIKLKDGINPLTASK
ADIAENLKNIKIVELEAAQLPRSRADVDFAWNGNYAISSGMKLTEALFQEPSFAYVNWS
AVKTADKDSQWLKDVTEAYNSDAFKAYAHKRFEGYKSPAAWNEGAAK*

>ZV08_4 [<SEQ ID 902>] (SEQ ID NO: 1107) 

MKTFFKTLSAAALALILAACGGQKDSAPAASASAAADNGAEKKEIVFGTTVGDFGDMVKE 
HIQPELEKKGYTVELVEFTDYVRPNLALAEGELDINVFQHKPYLDDFKKEHNLDITEVFQ
VPTAPLGLYPGKLKSLEEVKDGSTVSAPNDPSNFARVLVMLDELGWIKLKDGINPLTASK
ADIAENLKNIKIVELEAAQLPRSRADVDFAWNGNYAISSGMKLTEALFQEPSFAYVNWS
AVKTADKDSQWLKDVTEAYNSDAFKAYAHKRFEGYKSPAAWNEGAAK*

>ZV09_4 [<SEQ ID 902>] (SEQ ID NO: 902) 
MKTFFKTLSAAALALILAACGGQKDSAPAASASAAADNGAEKKEIVFGTTVGDFGDMVKE
HIQPELEKKGYTVKLVEFTDYVRPNLALAEGELDINVFQHKPYLDDFKKEHNLDITEVFQ
VPTAPLGLYPGKLKSLEEVKDGSTVSAPNDPSNFARVLVMLDELGWIKLKDGINPLTASK
ADIAENLKNIKIVELEAAQLPRSRADVDFAWNGNYAISSGMKLTEALFQEPSFAYVNWS
AVKTADKDSQWLKDVTEAYNSDAFKAYAHKRFEGYKSPAAWNEGAAK*

>ZV10_4 [<SEQ ID 903>] (SEQ ID NO: 903)

MKTFFKTLSAAALALILAACGGQKDSAPAASASAAADNGAAKKEIVFGTTVGDFGDMVKE
HIQPELEKKGYTVKLVEFTDYVRPNLALAEGELDINVFQHKPYLDDFKKEHNLDITEVFQ
VPTAPLGLYPGKLKSLEEVKDGSTVSAPNDPSNFARVLVMLDELGWIKLKDGINPLTASK
ADIAENLKNIKIVELEAAQLPRSRADVDFAWNGNYAISSGMKLTEALFQEPSFAYVNWS
AVKTADKDSQWLKDVTEAYNSDAFKAYAHKRFEGYKSPAAWNEGAAK*

>ZV11_4 [<SEQ ID 904>] (SEQ ID NO: 904) 

MKTFFKTLSAAALALILAACGGQKDSAPAASASAAADNGAAKKEIVFGTTVGDFGDMVKE 
QIQVELEKKGYTVKLVEFTDYVRPNLALAEGELDINVFQHKPYLDDFKKEHNLDITEVFQ
VPTAPLGLYPGKLKSLEEVKDGSTVSAPNDPSNFARVLVMLDELGWIKLKDGINPLTASK
ADIAENLKNIKIVELEAAQLPRSRADVDFAWNGNYAISSGMKLTEALFQEPSFAYVNWS
AVKTADKDSQWLKDVTEAYNSDAFKAYAHKRFEGYKSPAAWNEGAAK* 

>ZV12_4ASS [<SEQ ID 905>] (SEQ ID NO: 905) 

MKTFFKTLSAAALALILAACGGQKDRAPAASASAASENGAAKKEILFGTTVGDLGDMVKE
QIQAELEKKGYTVKLVEFTDYVRPNLALAEGELDINVFQHKPYLDDFKKEHNLDITEVFQ
VPTAPLGLYPGKLKSLEEVKDGSTVSAPNDPSNFARVLVMLDELGWIKLKDGINPLTASK
ADIAENLKNIKIVELEAAQLPRSRADVDFAWNGNYAISSGMKLTEALFQEPSFAYVNWS
AVKTADKDSQWLKDVTEAYNSDAFKAYAHKRFEGYKSPAAWNEGAAK* 

>ZV13_4 [<SEQ ID 906>] (SEQ ID NO: 906)

MKTFFKTLSAAALALILAACGGQKDSAPAASASAAADNGAAKKEIVFGTTVGDFGDMVKE
QIQPELEKKGYTVKLVEFTDYVRPNLALAEGELDINVFQHKPYLDDFKKEHNLDITEVFQ
VPTAPLGLYPGKLKSLEEVKDGSTVSAPNDPSNFARVLVMLDELGWIKLKDGINPLTASK
ADIAENLKNIKIVELEAAQLPRSRADVDFAWNGNYAISSGMKLTEALFQEPSFAYVNWS
AVKTADKDSQWLKDVTEAYNSDAFKAYAHKRFEGYKSPAAWNEGAAK*

>ZV15_4 [<SEQ ID 907>] (SEQ ID NO: 907) 

MKTFFKTLSAAALALILAACGGQKDSAPAASASAAADNGAEKKEIVFGTTVGDFGDMVKE
HIQPELEKKGYTVKLVEFTDYVRPNLALAEGELDINVFQHKPYLDDFKKEHNLDITEVFQ
VPTAPLGLYPGKLKSLEEVKDGSTVSAPNDPSNFARVLVMLDELGWIKLKDGINPLTASK
ADIAENLKNIKIVELEAAQLPRSRADVDFAWNGNYAISSGMKLTEALFQEPSFAYVNWS
AVKTADKDSQWLKDVTEAYNSDAFKAYAHKRFEGYKSPAAGNEGAAK*

>ZV16_4 [<SEQ ID 908>] (SEQ ID NO: 908) 

MKTFFKTLSAAALALILAACGGQKDSAPAASASAAADNGAEKKEIVFGTTVGDFGDMVKE
HIQPELEKKGYTVKLVEFTDYVRPNLALAEGELDINVFQHKPYLDDFKKEHNLDITEVFQ
VPTAPLGLYPGKLKSLEEVKDGSTVSAPNDPSNFARVLVMLDELGWIKLKDGINPLTASK
ADIAENLKNIKIVELEAAQLPRSRADVDFAWNGNYAISSGMKLTEALFQEPSFAYVNWS 
AVKTADKDSQWLKDVTEAYNSDAFKAYAHKRFEGYKSPAAWNEGAAK* 

>ZV17_4 [<SEQ ID 909>] (SEQ ID NO: 909) 
MKTFFKTLSAAALALILAACGGQKDSAPAASASAAADNGAEKKEIVFGTTVGDFGDMVKE
QIQAELEKKGYTVKLVEFTDYVRPNLALAEGELDINVFQHKPYLDDFKKEHNLDITEVFQ
VPTAPLGLYPGKLKSLEEVKDGSTVSAPNDPSNFARVLVMLDELGWIKLKDGINPLTASK
ADIAENLKNIKIVELEAAQLPRSRADVDFAWNGNYAISSGMKLTEALFQEPSFAYVNWS
AVKTADKDSQWLKDVTEAYNSDAFKAYAHKRFEGYKSPAAWNEGAAK*

>ZV18_4 [<SEQ ID 910>] (SEQ ID NO: 910) 

MKTFFKTLSAAALALILAACGGQKDSAPAASASAAADNGAEKKEIVFGTTVGDFGDMVKE
HIQPELEKKGYTVKLVEFTDYVRPNLALAEGELDINVFQHKPYLDDFKKEHNLDITEVFQ
VPTAPLGLYPGKLKSLEEVKDGSTVSAPNDPSNFARVLVMLDELGWIKLKDGINPLTASK
ADIAENLKNIKIVELEAAQLPRSRADVDFAWNGNYAISSGMKLTEALFQEPSFAYVNWS
AVKTADKDSQWLKDVTEAYNSDAFKAYAHKRFEGYKSPAAWNEGAAK*

>ZV19_4 [<SEQ ID 911>] (SEQ ID NO: 911) 

MKTFFKTLSAAALALILAACGGQKDSAPAASASAAADNGAAKKEIVFGTTVGDFGDMVKE
QIQAELEKKGYTVELVEFTDYVRPNLALAEGELDINVFQHKPYLDDFKKEHNLDITEVFQ
VPTAPLGLYPGKLKSLEEVKDGSTVSAPNDPSNFARVLVMLDELGWIKLKDGINPLTASK
ADIAENLKNIKIVELEAAQLPRSRADVDFAWNGNYAISSGMKLTEALFQEPSFAYVNWS
AVKTADKDSQWLKDVTEAYNSDAFKAYAHKRFEGYKSPAAWNEGAAK*

>ZV20_4 [<SEQ ID 912>] (SEQ ID NO: 912)

MKTFFKTLSAAALALILAACGGQKDSAPAASASAAADNGAAKKEIVFGTTVGDFGDMVKE
QIQAELEKKGYTVELVEFTDYVRPNLALAEGELDINVFQHKPYLDDFKKEHNLDITEVFQ
VPTAPLGLYPGKLKSLEEVKDGSTVSAPNDPSNFARVLVMLDELGWIKLKDGINPLTASK
ADIAENLKNIKIVELEAAQLPRSRADVDFAWNGNYAISSGMKLTEALFQEPSFAYVNWS
AVKTADKDSQWLKDVTEAYNSDAFKAYAHKRFEGYKSPAAWNEGAAK*

>ZV21_4 [<SEQ ID 913>] (SEQ ID NO: 913) 

MKTFFKTLSAAALALILAACGGQKDSAPAASASAAADNGAAKKEIVFGTTVGDFGDMVKE
QIQAELEKKGYTVKLVEFTDYVRPNLALAEGELDINVFQHKPYLDDFKKEHNLDITEVFQ
VPTAPLGLYPGKLKSLEEVKDGSTVSAPNDPSNFARVLVMLDELGWIKLKDGINPLTASK
ADIAENLKNIKIVELEAAQLPRSRADVDFAWNGNYAISSGMKLTEALFQEPSFAYVNWS
AVKTADKDSQWLKDVTEAYNSDAFKAYAHKRFEGYKSPAAWNEGAAK*

>ZV22_4 [<SEQ ID 914>] (SEQ ID NO: 914) 
MKTFFKTLSAAALALILAACGGQKDSAPAASASAAADNGAAKKEIVFGTTVGDFGDLVKE
QIQPELEKKGYTVELVEFTDYVRPNLALGEGELDINVFQHKPYLDDFKKEHNLDITEVFQ
VPTAPLGLYPGKLKSLEEVKDGSTVSAPNDPSNFARVLVMLDELGWIKLKDGINPLTASK
ADIAENLKNIKIVELEAAQLPRSRADVDFAWNGNYAISSGMKLTEALFQEPSFAYVNWS
AVKTADKDSQWLKDVTEAYNSDAFKAYAHKRFEGYKSPAAWNEGAAK*

>ZV24_4ASS [<SEQ ID 915>] (SEQ ID NO: 915)

MKTFFKTLSAAALALILAACGGQKDSAPAASASAAADNGAEKKEIVFGTTVGDFGDMVKE
HIQPELEKKGYTVELVEFTDDVRPNLALGEGELDIIVFQHKPYLDDFKKEQNLDITEVFQ
VPTAPLGLYPGKLKSLEEVKDGSTVSAPNDPSNFARVLVMLDELGWIKLKDGINPLTASK
ADIAENLKNIKIVELEAAQLPRSRADVDFAWNGNYAISSGMKLTEALFQEPSFAYVNWS
AVKTADKDSQWLKDVTEAYNSDAFKAYAHKRFEGYKSPAAWNEGAAK*

>ZV25_4 [<SEQ ID 916>] (SEQ ID NO: 916) 
MKTFFKTLSAAALALILAACGGQKDSAPAASASAAADNGAEKKEIVFGTTVGDFGDMVKE
QIQPELEKKGYTVKLVEFTDYVRPNLALAEGELDINVFQHKPYLDDFKKEHNLDITEVFQ
VPTAPLGLYPGKLKSLEEVKDGSTVSAPNDPSNFARALVMLDELGWIKLKDGINPLTASK
ADIAENLKNIKIVELEAAQLPRSRADVDFAWNGNYAISSGMKLTEALFQEPSFAYVNWS
AVKTADKDSQWLKDVTEAYNSDAFKAYAHKRFEGYKSPAAWNEGAAK*

>ZV26_4 [<SEQ ID 917>] (SEQ ID NO: 917)

MKTFFKTLSAAALALILAACGGQKDSAPAASASAAADNGAEKKEIVFGTTVGDFGDMVKE
HIQPELEKKGYTVKLVEFTDYVRPNLALAEGELDINVFQHKPYLDDFKKEHNLDITEVFQ
VPTAPLGLYPGKLKSLEEVKDGSTVSAPNDPSNFARVLVMLDELGWIKLKDGINPLTASK
ADIAENLKNIKIVELEAAQLPRSRADVDFAWNGNYAISSGMKLTEALFQEPSFAYVNWS
AVKTADKDSQWLKDVTEAYNSDAFKAYAHKRFEGYKSPAAWNEGAAK*

>ZV27_4 [<SEQ ID 918>] (SEQ ID NO: 918) 

MKTFFKTLSAAALALILAACGGQKDSAPAASASAAADNGAAKKEIVFGTTVGDFGDMVKE
QIQPELEKKGYTVKLVEFTDYVRPNLALAEGELDINVFQHKPYLDDFKKEHNLDITEVFQ
VPTAPLGLYPGKLKSLEEVKDGSTVSAPNDPSNFARVLVMLDELGWIKLKDGINPLTASK
ADIAENLKNIKIVELEAAQLPRSRADVDFAWNGNYAISSGMKLTEALFQEPSFAYVNWS
AVKTADKDSQWLKDVTEAYNSDAFKAYAHKRFEGYKSPAAWNEGAAK*

>ZV28_4 [<SEQ ID 919>] (SEQ ID NO: 919)

MKTFFKTLSAAALALILAACGGQKDSAPAASASAAADNGAEKKEIVFGTTVGDFGDMVKE
HIQPELEKKGYTVKLVEFTDYVRPNLALAEGELDINVFQHKPYLDDFKKEHNLDITEVFQ
VPTAPLGLYPGKLKSLEEVKDGSTVSAPNDPSNFARVLVMLDELGWIKLKDGINPLTASK
ADIAENLKNIKIVELEAAQLPRSRADVDFAWNGNYAISSGMKLTEALFQEPSFAYVNWS
AVKTADKDSQWLKDVTEAYNSDAFKAYAHKRFEGYKSPAAWNEGAAK*

>ZV29_4 [<SEQ ID 920>] (SEQ ID NO: 920) 

MKTFFKTLSAAALALILAACGGQKDSAPAASASAAADNGAAKKEIVFGTTVGDFGDMVKE 
QIQVELEKKGYTVKLVEFTDYVRPNLALAEGELDINVFQHKPYLDDFKKEHNLDITEVFQ
VPTAPLGLYPGKLKSLEEVKDGSTVSAPNDPSNFARVLVMLDELGWIKLKDGINPLTASK
ADIAENLKNIKIVELEAAQLPRSRADVDFAWNGNYAISSGMKLTEALFQEPSFAYVNWS
AVKTADKDSQWLKDVTEAYNSDAFKAYAHKRFEGYKSPAAWNEGAAK* 

>ZV32_4 [<SEQ ID 921>] (SEQ ID NO: 921) 
MKTFFKTLSAAALALILAACGGQKDSAPAASAAAPSADNGAAKKEIVFGTTVGDFGDMVK
EQIQAELEKKGYTVKLVEFTDYVRPNLALAEGELDINVFQHKPYLDDFKKEHNLDITEAF
QVPTAPLGLYPGKLKSLEEVKDGSTVSAPNDPSNFARALVMLNELGWIKLKDGINPLTAS
KADIAENLKNIKIVELEAAQLPRSRADVDFAWNGNYAISSGMKLTEALFQEPSFAYVNW
SAVKTADKDSQWLKDVTEAYNSDAFKAYAHKRFEGYKYPAAWNEGAAK*

>ZV33_4 [<SEQ ID 922>] (SEQ ID NO: 922)

MKTFFKTLSAAALALILAACGGQKDSAPAASAAAPSADNGAAKKEIVFGTTVGDFGDMVK
EQIQAELEKKGYTVKLVEFTDYVRPNLALAEGELDINVFQHKPYLDDFKKEHNLDITEAF
QVPTAPLGLYPGKLKSLEEVKDGSTVSAPNDPSNFARALVMLNELGWIKLKDGINPLTAS
KADIAENLKNIKIVELEAAQLPRSRADVDFAWNGNYAISSGMKLTEALFQEPSFAYVNW
SAVKTADKDSQWLKDVTEAYNSDAFKAYAHKRFEGYKYPAAWNEGAAK*

>ZV96_4 [<SEQ ID 923>] (SEQ ID NO: 923) 

MKTFFKTLSAAALALILAACGGQKDSAPAASASAAADNGAEKKEIVFGTTVGDFGDMVKE
QIQAELEKKGYTVKLVEFTDYVRPNLALAEGELDINVFQHKPYLDDFKKEHNLDITEVFQ
VPTAPLGLYPGKLKSLEEVKDGSTVSAPNDPSNFARVLVMLDELGWIKLKDGINPLTASK
ADIAENLKNIKIVELEAAQLPRSRADVDFAWNGNYAISSGMKLTEALFQEPSFAYVNWS
AVKTADKDSQWLKDVTEAYNSDAFKAYAHKRFEGYKSPAAWNEGAAK*

Figure 8 shows the results of aligning the sequences of each of these strains. Dark shading indicates regions of homology, and gray shading indicates the conservation of amino acids with similar characteristics. As is readily discernible, there is significant conservation among the various strains of ORF 4 (SEQ ID NO: 216), further confirming its utility as an antigen for both vaccines and diagnostics.
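
The kind of conservation summarized in Figure 8 can be illustrated with a minimal Python sketch that flags the fully conserved columns of a set of pre-aligned, equal-length sequences. It is not the method used to prepare Figure 8, and it does not score amino acids with similar characteristics. The function name `conserved_columns` is hypothetical, and the example uses only a ten-residue window taken from the start of the second printed line of four of the strain sequences above.

```python
from typing import List


def conserved_columns(seqs: List[str]) -> List[bool]:
    """Return one flag per column: True if every sequence has the same residue.

    The sequences must already be aligned and of equal length.
    """
    if not seqs:
        raise ValueError("at least one sequence is required")
    if len({len(s) for s in seqs}) != 1:
        raise ValueError("sequences must be aligned to the same length")
    return [len(set(column)) == 1 for column in zip(*seqs)]


# Ten-residue window from four of the listed strains (illustrative subset).
window = {
    "ZV01_4": "QIQAELEKKG",
    "ZV02_4": "HIQPELEKKG",
    "Z2491_4": "QIQPELEKKG",
    "ZV11_4": "QIQVELEKKG",
}

flags = conserved_columns(list(window.values()))
variable = [i + 1 for i, same in enumerate(flags) if not same]
print(f"{flags.count(True)} of {len(flags)} columns conserved; "
      f"variable columns: {variable}")
```

In this window only the first and fourth columns vary among the four strains; the remaining eight are identical.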

It will be appreciated that the invention has been described by means of example only, and that 
modifications may be made whilst remaining within the spirit and scope of the invention. 

ABSTRACT 

The invention provides proteins from Neisseria meningitidis (strains A & B) and from Neisseria 
gonorrhoeae, including amino acid sequences, the corresponding nucleotide sequences, expression 
data, and serological data. The proteins are useful antigens for vaccines, immunogenic 
compositions, and/or diagnostics. 